Here’s How Our College Football Playoff Predictions Work

UPDATE (Nov. 1, 2016; 7:30 p.m.): Our 2016 College Football Predictions follow the same methodology as our predictions from last year did, except they will update more frequently. Check out the article below for more details.

Are you ready for some football? Or at least, some vociferous arguing about football?

Last season’s first-ever College Football Playoff taught us two things. First, going from a championship game to a four-team playoff won’t end the annual bickering over which teams belong. Second, the playoff selection committee seemingly takes a different approach than voters traditionally have in the coaches’ and media polls. In particular, it is more willing to scramble teams’ positions from week to week, even when everyone wins out.

Florida State, for instance, despite never losing during the regular season, moved from No. 2 in the committee’s initial rankings to No. 4 on Dec. 2, before being upgraded to No. 3 in the committee’s final rankings on Dec. 7. More consequentially, TCU dropped from No. 3 to No. 6 — and out of the playoff — in the committee’s final rankings despite having won its final game over Iowa State 55-3. Historically, it’s unusual in the coaches’ and media polls for a team to lose ground after winning a game, and there’s almost no precedent for a team dropping three spots, as TCU did.

You can spin all of this as a good thing (“the committee starts with a blank slate every week and gives each team a fair shake!”) or a bad one (“the committee is inconsistent and indecisive!”). Either way, it has implications if you’re trying to anticipate the committee’s next move — as we, like so many other college football fans, are foolishly trying to do.

Our way is pretty geeky and statistical, of course. Last season, we introduced a model that simulated the rest of the college football season and sought to forecast which four teams would make the playoff. We described the model as “speculative” since the committee was a new thing, giving us no historical data to measure its behavior; instead, we used the historical behavior of voters in the coaches’ poll as a substitute.

For several weeks, the model seemed to be uncannily accurate! Then … a (mild?) disaster. Our final set of simulations gave TCU a 91 percent chance of making the playoff, Florida State a 68 percent chance, and Ohio State a 40 percent chance. But TCU, the safest bet according to the model¹, was the one left out.

While we could have sheepishly attributed this result to “bad luck” (a 91 percent chance isn’t a 100 percent chance), we don’t think that’s the right conclusion here. Instead, with a year of experience under our belts to show us how the committee works, we’re making a few revisions to the model:

First, we account for the committee’s potential to scramble the ratings slightly from week to week, even where the on-field action didn’t seem to warrant it, such as in its flip-flopping of Florida State and TCU last year. The mechanics of this are a little involved; I’ll describe them briefly down below.
Second, we assign a small bonus² to conference champions.³
Third, the model accounts for slightly more uncertainty than in last year’s version.

These aren’t huge changes — the backbone of the model is the same as last year — but for what it’s worth, the revised version of the model would have correctly predicted the top four seeds last season, and in the right order. (Both Florida State and Ohio State would have been projected to leapfrog ahead of TCU in the committee’s final standings.) I say “for what it’s worth” because it’s not very much of an accomplishment to “predict” something after it’s already happened. But having one year’s worth of data on the committee’s behavior is a lot better than none.

The model works by simulating out the rest of the season thousands of times. It’s iterative, meaning that it does this week by week. The process works like this:

Start with Week 10 committee standings.
Simulate Week 10 games.
Forecast how Week 11 committee standings will change in response to simulated Week 10 games.
Simulate Week 11 games.
Rinse and repeat until you get to the final committee rankings on Dec. 6.
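
The steps above can be sketched as a small Monte Carlo loop. This is a minimal illustration under stated assumptions, not our production code: `simulate_season`, `toy_rerank`, the six placeholder teams, the coin-flip win probabilities and the 2.5-slot penalty for a loss are all hypothetical stand-ins for the real inputs.

```python
import random
from collections import Counter


def simulate_season(rankings, schedule, win_prob, rerank, n_sims=1000, seed=0):
    """Estimate playoff odds by replaying the remaining weeks many times.

    rankings: team names, best first (the current committee standings).
    schedule: list of weeks; each week is a list of (team_a, team_b) games.
    win_prob: function (team_a, team_b) -> probability that team_a wins.
    rerank:   function (rankings, results) -> standings for the next week.
    """
    rng = random.Random(seed)
    playoff_counts = Counter()
    for _ in range(n_sims):
        current = list(rankings)
        for games in schedule:
            # Simulate this week's games.
            results = []
            for a, b in games:
                if rng.random() < win_prob(a, b):
                    results.append((a, b))  # (winner, loser)
                else:
                    results.append((b, a))
            # Forecast how the standings respond to the simulated results.
            current = rerank(current, results)
        # The top four in the final standings make the playoff.
        playoff_counts.update(current[:4])
    return {team: playoff_counts[team] / n_sims for team in rankings}


def toy_rerank(rankings, results):
    """Hypothetical rule: each loser is pushed down by 2.5 slots, then re-sort."""
    losers = {loser for _, loser in results}
    scored = [(pos + (2.5 if team in losers else 0.0), team)
              for pos, team in enumerate(rankings)]
    return [team for _, team in sorted(scored)]


teams = ["A", "B", "C", "D", "E", "F"]
schedule = [[("A", "B"), ("C", "D")], [("E", "F"), ("A", "C")]]
odds = simulate_season(teams, schedule, lambda a, b: 0.5, toy_rerank, n_sims=2000)
```

Because exactly four teams make the playoff in every simulated season, the playoff probabilities across all teams always add up to four.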

The model also simulates conference championship games and the four-team playoff itself. Thus, it provides a probabilistic estimate of each team’s chances of winning its conference, making the playoff, and winning the national championship.

We’ll be updating the numbers twice weekly: first, on Sunday morning (or very late Saturday evening) after the week’s games are complete; and second, on Tuesday evening after the new committee rankings come out. In addition to a probabilistic estimate of each team’s chances of winning its conference, making the playoff, and winning the national championship, we’ll also list three inputs to the model: each team’s current committee ranking, FPI, and Elo. Let me explain the role that each of these plays.

Team	CFP rank	Elo rank	FPI rank	Prob. of conf. title	Prob. of playoffs	Prob. of nat. title
Ohio State	3	1	4	47%	61%	16%
Clemson	1	7	7	56%	51%	12%
Alabama	4	2	6	14%	41%	11%
TCU	8	4	2	37%	31%	11%
Baylor	6	10	1	32%	31%	13%
LSU	2	5	8	22%	30%	8%
Notre Dame	5	8	9	—	25%	5%
Michigan State	7	3	19	15%	22%	3%
Stanford	11	6	13	46%	19%	3%
Florida	10	9	12	41%	18%	4%
Oklahoma	15	16	3	15%	14%	5%
Mississippi	18	17	10	20%	8%	2%
Iowa	9	12	29	25%	7%
Michigan	17	22	18	7%	6%
Oklahoma St.	14	11	14	15%	6%	1%
Utah	12	15	21	18%	6%
Memphis	13	14	36	21%	6%
Florida State	16	13	15	13%	5%
USC	—	20	5	30%	4%	1%
Mississippi St.	20	19	17		3%
Houston	25	23	33	30%	2%
UCLA	23	21	22	5%	1%
North Carolina	—	26	23	23%
Toledo	24	24	43	28%
Temple	22	32	45	41%
Oregon	—	25	32
Wisconsin	—	18	24	5%
Texas A&M	19	30	16
Arkansas	—	39	26
Penn State	—	27	41
Northwestern	21	42	57

Remember: the committee rankings are a starting point for the model and not the ending point. At this relatively early point in the season, the committee standings won’t matter very much; there are too many opportunities for the teams to be scrambled later on. (Consider, for instance, that eventual national champion Ohio State started out at No. 16 last year.) They’ll tend to matter more as the season goes along, although, as we saw with TCU last year, nothing except the committee’s final rankings is all that definitive.

FPI is ESPN’s Football Power Index. We consider it the best predictor of future college games, so that’s the role it plays in the model: if we say Team A has a 72 percent chance of beating Team B, that prediction is derived from FPI. Technically speaking, we’re using a simplified version of FPI that accounts for only each team’s current rating and home-field advantage; the FPI-based predictions you see on ESPN.com may differ slightly because they also account for travel distance and days of rest.
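
To make the idea concrete, here is one common way to turn a rating gap plus home-field advantage into a win probability. It is a sketch under assumptions, not ESPN’s formula: it treats ratings as points, assumes game margins are roughly normally distributed around the rating gap, and the 14-point spread of outcomes (`margin_sd`) and 3-point `home_edge` in the usage line are hypothetical values chosen for illustration.

```python
import math


def win_probability(rating_a, rating_b, home_edge=0.0, margin_sd=14.0):
    """Probability that team A beats team B.

    rating_a, rating_b: power ratings in points (higher is better).
    home_edge: points added to A's side if A is at home (negative if B is).
    margin_sd: assumed standard deviation of the final margin, in points.
    """
    expected_margin = rating_a - rating_b + home_edge
    z = expected_margin / margin_sd
    # Normal CDF evaluated at z: the chance the actual margin is above zero.
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


# Hypothetical example: A is rated 5 points better and plays at home.
p_home_favorite = win_probability(20.0, 15.0, home_edge=3.0)
```

Two evenly rated teams on a neutral field come out at exactly 50 percent, and swapping the teams (and the sign of the home edge) gives the complementary probability.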

But if FPI is good at predicting, it’s not very “politically correct,” meaning that it deliberately doesn’t care about how human beings might rank the teams. For instance, FPI currently has USC, a team with three losses, as the fifth-best team in the country, ahead of undefeated Clemson! Committee voters would never do that.

Instead, that’s the role that our college football Elo ratings play. If you’re familiar with FiveThirtyEight, you’ll be familiar with Elo ratings. They’re a simple mathematical system that forms the basis of our NFL forecasts, for instance. We’ve also applied Elo to soccer, the NBA and other sports.

Our college football Elo ratings are a little different, however. Rather than being designed to maximize predictive accuracy (we have FPI for that), they’re designed to mimic how humans rank the teams.⁴ Their parameters are set so as to place a lot of emphasis on strength of schedule and especially on recent “big wins,” because that’s what human voters have historically done too. They aren’t very forgiving of losses, conversely, even if they came by a narrow margin under tough circumstances. And they assume that, instead of everyone starting with a truly blank slate, human beings look a little bit at how a team fared in previous seasons. Alabama is more likely to get the benefit of the doubt than Vanderbilt, for example, other factors held equal.
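
For readers who haven’t seen Elo before, the textbook version fits in a few lines. This sketch is the standard Elo update, not our tuned college football variant: the `k` value, the 1500 average and the 70 percent `carryover` from the previous season are hypothetical numbers picked only to show the mechanics.

```python
def expected_score(rating_a, rating_b):
    """Standard Elo expectation that A beats B (400-point logistic scale)."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def update_elo(winner, loser, k=30.0):
    """Return the new (winner, loser) ratings after a game.

    The winner gains more when the win was less expected, so beating a
    highly rated opponent (a "big win") moves a team up the most.
    """
    gain = k * (1.0 - expected_score(winner, loser))
    return winner + gain, loser - gain


def preseason_rating(last_season_rating, mean=1500.0, carryover=0.7):
    """Start a season partway between last year's rating and the average,
    so past success gives a team some benefit of the doubt."""
    return mean + carryover * (last_season_rating - mean)


# An upset moves the ratings more than an expected result does.
upset_gain = update_elo(1500.0, 1700.0)[0] - 1500.0
routine_gain = update_elo(1700.0, 1500.0)[0] - 1700.0
```

The carryover step is what lets a team with a strong recent history start ahead of a team with a weak one before either has played a game.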

How do Elo ratings help the model? As it plays out each week of the season, the model forecasts each team’s new projected ranking based on a combination of its committee ranking in the previous week, the game result (as simulated by FPI) and its Elo rating. In other words, Elo ratings form a counterbalance against the committee rankings, which as we’ve seen can be subject to change. Last year, for instance, Elo had Florida State ranked very highly: As an undefeated returning national champion, the Seminoles had the profile of a team that human voters typically love. Elo also had Ohio State ranked highly, well ahead of TCU. Thus, the model wouldn’t have been so surprised that Florida State and Ohio State jumped ahead of TCU in the final standings.⁵
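
Here is a rough sketch of how such a blend could work. Everything about the formula is an assumption made for illustration: the linear mix of ranks, the way the Elo weight is drawn around 20 percent in each simulation, and the six-slot `loss_penalty` are hypothetical, and the function name `project_rankings` is ours for this example only. The ranks in the usage lines come from the table above; the LSU loss is a made-up simulated result.

```python
import random


def project_rankings(committee_rank, elo_rank, losers, rng,
                     elo_weight_mean=0.20, elo_weight_sd=0.05,
                     loss_penalty=6.0):
    """Project next week's committee ranking for a set of ranked teams.

    committee_rank, elo_rank: dicts of team -> rank (1 is best).
    losers: set of teams that lost their simulated game this week.
    rng: random.Random used to vary the Elo weight between simulations.
    """
    # Draw this simulation's Elo weight, clamped to the range [0, 1].
    w = min(max(rng.gauss(elo_weight_mean, elo_weight_sd), 0.0), 1.0)
    scores = {}
    for team, rank in committee_rank.items():
        score = (1.0 - w) * rank + w * elo_rank[team]
        if team in losers:
            score += loss_penalty  # a loss pushes a team down the order
        scores[team] = score
    # Lower score is better; ties are broken alphabetically for determinism.
    ordered = sorted(scores, key=lambda t: (scores[t], t))
    return {team: i + 1 for i, team in enumerate(ordered)}


committee = {"Clemson": 1, "LSU": 2, "Ohio State": 3, "Alabama": 4}
elo = {"Clemson": 7, "LSU": 5, "Ohio State": 1, "Alabama": 2}
projected = project_rankings(committee, elo, losers={"LSU"}, rng=random.Random(1))
```

This toy version only handles teams that already have a committee ranking; unranked teams would need some default starting position.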

If the rankings still look a little off to you — if you can’t quite figure out how a team gets to where it does based on Elo, FPI and its current committee ranking — there’s one other likely culprit, which is a team’s future strength of schedule. LSU, for instance, is given only a 30 percent chance of making the playoff in part because they have a brutal schedule ahead, with games against Alabama (this weekend), Mississippi and Texas A&M — plus a potential SEC Championship game against Florida. If a team has already taken a loss or two and is currently out of the running, however, a tougher upcoming strength of schedule may help it, because it means that the team has more opportunities to impress the committee and get it to reconsider.

Most importantly of all, there’s still a lot of football left to be played. It’s hard for any team to run the table, and even current front-runners like Ohio State and Baylor won’t be safe if they endure a loss. Thus, only two teams (Ohio State and Clemson) start with more than a 50 percent chance of making the playoff in our initial forecast.
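
A little arithmetic shows why running the table is so hard. The sketch below assumes each game is independent of the others, which is a simplification, and the win probabilities in the usage lines are hypothetical rather than taken from the model.

```python
import math


def prob_win_out(game_win_probs):
    """Chance of winning every remaining game, assuming independence."""
    return math.prod(game_win_probs)


def prob_at_most_one_loss(game_win_probs):
    """Chance of finishing the remaining games with zero or one loss."""
    total = prob_win_out(game_win_probs)
    for i, p_loss_game in enumerate(game_win_probs):
        # Lose exactly game i and win all the others.
        others = game_win_probs[:i] + game_win_probs[i + 1:]
        total += (1.0 - p_loss_game) * math.prod(others)
    return total


# Hypothetical slate: a team favored in each of its four remaining games.
remaining = [0.8, 0.7, 0.75, 0.6]
chance_unbeaten = prob_win_out(remaining)
chance_one_loss_or_fewer = prob_at_most_one_loss(remaining)
```

Even a team favored in every remaining game can be an underdog to win all of them, because the individual probabilities multiply.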

Footnotes

¹ After sure-things Alabama and Oregon, to whom the model gave a 100 percent chance.
² Because we don’t have a firm idea of how much the committee rewards conference champions, we treat the magnitude of the conference championship bonus as being uncertain, and it varies from simulation to simulation. In some simulations, winning the conference championship is associated with a fairly large bonus; in others, it’s associated with no bonus at all. (The bonus can never be negative.) On average, however, it’s fairly small, and it acts as the equivalent of a tiebreaker in otherwise close cases.
³ The program breaks all ties for conference championships based on head-to-head results among the tied teams, and then randomly if the tie remains unresolved. We may build in more complex tiebreaking rules later in the season to the extent they become relevant.
⁴ As based on a historical analysis of the coaches’ poll and last year’s week-to-week committee standings.
⁵ I don’t want to overstate the importance of the Elo ratings, either. They make up about 20 percent of the weight in the model (the exact fraction varies slightly from simulation to simulation), and they’ll usually be pretty well correlated with the committee rankings. So while they might result in a team being projected to move up from fifth to fourth, they won’t usually imply wholesale changes.
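
The conference-champion bonus and the tiebreaking rule described in the footnotes above can be sketched in a few lines. The distribution is an assumption: drawing from a normal curve and clipping at zero is just one simple way to get a bonus that is sometimes zero, never negative and small on average, and the `mean` and `sd` values are hypothetical. The function names are ours for this example only.

```python
import random


def draw_conference_bonus(rng, mean=0.5, sd=1.0):
    """Draw one simulation's bonus for conference champions.

    Negative draws are clipped to zero, so some simulations apply no
    bonus at all and none apply a penalty.
    """
    return max(0.0, rng.gauss(mean, sd))


def break_tie(tied_teams, head_to_head, rng):
    """Pick a conference champion among tied teams.

    head_to_head: set of (winner, loser) pairs for games already played.
    Uses head-to-head wins among the tied teams first, then a random draw
    if that still leaves more than one team.
    """
    wins = {
        team: sum((team, other) in head_to_head
                  for other in tied_teams if other != team)
        for team in tied_teams
    }
    best = max(wins.values())
    leaders = sorted(team for team in tied_teams if wins[team] == best)
    if len(leaders) == 1:
        return leaders[0]
    return rng.choice(leaders)


rng = random.Random(0)
bonuses = [draw_conference_bonus(rng) for _ in range(1000)]
champion = break_tie(["A", "B", "C"], {("A", "B"), ("A", "C"), ("B", "C")}, rng)
```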
