Make College Football Great Again

Ohio State somewhat embarrassed the Big Ten by getting shut out by Clemson 31-0 in the College Football Playoff semifinal last week. Still, hindsight is 20/20, and I don’t necessarily begrudge the playoff selection committee for having turned down Penn State, which won the Big Ten championship, in favor of the Buckeyes. Ohio State was probably the better regular-season team and had fewer losses against a tougher schedule. But Penn State (which for its part blew a big lead to lose the Rose Bowl to USC) had a head-to-head win against Ohio State and the conference title, two factors the committee explicitly says it considers in ranking the teams. It was a tough decision.

My point is simply this: Conference championships, as currently devised, don’t make much sense. Because of imbalanced divisions, championship games often don’t pit the two best teams in a conference against each other (Big Ten championship participant Wisconsin was probably the fourth-best team in its league, for instance). They’ll sometimes result in an awkward rematch of a game that was already played during the regular season. And conference championship games waste a weekend that could be better spent on something else, such as expanding the College Football Playoff to six or eight teams.

And now we have pretty good evidence that the playoff selection committee doesn’t really care about them one way or another. So let’s get rid of them! Imagine a world in which we’re spared the annual indignity of having to watch Florida lose to Alabama 59-2. Imagine a world in which historical rivals play each other every year and yet, by almighty Rockne, the best teams in a conference always play one another, too. Imagine a world with no divisions. By which I mean: a world in which we eliminate divisions such as the ACC’s perplexingly named Atlantic and Coastal divisions, and all teams within the same college football conference compete as one.

Not only have I imagined such a world, my friends, but I have seen one. I have seen it in the hallways of a high-school debate tournament.

High-school debate tournaments (all of you will be shocked to learn that I was a master debater in high school) face some of the same constraints that college football conferences do. In any given tournament, there are lots of teams of radically varying quality levels, and there’s not nearly enough time to have them all play one another. A typical debate tournament, for example, might involve 60 teams but only six rounds of competition, with the best eight or 16 teams advancing to the playoffs (or what debaters call the “outrounds”). Each round is precious, and you don’t necessarily want to watch some pimply-faced sophomores from a Class D school debating a Class A juggernaut like my alma mater, East Lansing High School, any more than you want to watch Rutgers lose to Michigan 78-0.

The solution that debate tournaments devised is something called power-pairing. Power-pairing just means that teams with the same record are paired off against each other, so that a team that starts off the tournament 2-0 will face off against another 2-0 team, for instance. It usually works by drawing the first two rounds of a tournament at random,¹ and after that, everything is power-paired.
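As a rough illustration, one round of power-pairing might be sketched in Python like this. The function name `power_pair`, the rule of dropping an odd team into the next bracket down, and the sample records are all assumptions made for the example; real tournaments also avoid rematches, which this sketch ignores.

```python
import random
from collections import defaultdict

def power_pair(records, rng=random):
    """Pair teams that have the same number of wins.

    records maps team name -> wins so far. If a bracket has an odd number
    of teams, the leftover team is carried into the next bracket down
    (an assumption of this sketch, not a rule taken from the article).
    """
    brackets = defaultdict(list)
    for team, wins in records.items():
        brackets[wins].append(team)

    pairs, carry = [], []
    for wins in sorted(brackets, reverse=True):
        # Shuffle within the bracket so same-record pairings are random.
        pool = carry + rng.sample(brackets[wins], len(brackets[wins]))
        while len(pool) >= 2:
            pairs.append((pool.pop(0), pool.pop(0)))
        carry = pool  # zero or one team left over
    return pairs, carry

# Hypothetical eight-team field after two random rounds.
records = {"A": 2, "B": 2, "C": 1, "D": 1, "E": 1, "F": 1, "G": 0, "H": 0}
pairs, leftover = power_pair(records, random.Random(0))
```

With these records, every pair matches two teams on the same win total and nobody is left without an opponent.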

This turns out to be a surprisingly elegant solution. It helps to make the matchups relatively even, which not only helps students to learn more but also usually tells you more in determining the best teams. Furthermore, the pairings are somewhat self-correcting. Suppose a good team happens to randomly draw very tough opponents in its first two rounds and gets off to an 0-2 start. They’ll receive some compensation by being paired with easier opponents the rest of the way out — an 0-2 team and then a 1-2 team, and so forth. As another bonus to this system, the best teams are put through the gantlet and really earn their keep. A team that finishes its tournament undefeated or with just one loss will have beaten a lot of very good teams along the way.

What would power-pairing look like in the context of a college football season? Here’s an example that I drew up involving this year’s Big Ten. I experimented with a few different setups, and happen to like this one, but feel free to disagree with the particulars (this is more a proof-of-concept than anything I’ve thought all that much about).

It works like this: Each team plays nine conference games, the same number they play under the Big Ten’s current rules. Five of these are scheduled in advance, while four are power-paired or “flex” matchups determined only once the season is underway. To be more specific:

Teams play rivalry games in weeks 2, 4 and 7. These matchups are the same every year. Week 7 features the most storied rivalries such as Michigan vs. Ohio State — the games that the Big Ten currently plays in the last week of the season. The games in weeks 2 and 4 involve secondary or tertiary rivals, such as Ohio State vs. Illinois or Michigan vs. Minnesota. Granted, this doesn’t always work out perfectly, since some teams (such as Michigan) have lots of Big Ten rivals and others (here’s looking at you, Maryland) don’t really have any. In real life, you might retain some of these games but have others chosen on a random or rotating basis.
The matchups in weeks 1 and 3 are based on the previous season’s standings. Week 1 is a high-low pairing (the best teams from the previous season play the worst teams) while Week 3 is a high-high pairing (the best teams play the best teams and the worst teams play the worst teams). In theory, this gives each team one relatively tough and one relatively easy matchup within the first few weeks of the season.
Weeks 5, 6, 8 and 9 are flex or power-paired matchups, where teams are paired against others with similar records that they haven’t played previously and that they aren’t already scheduled to play against in the future. (I’ll describe the procedure for pairing teams in a moment.) Each team has two home flex games and two away flex games, with the weeks designated in advance: For instance, Penn State has away games in weeks 5 and 9 and home flex games in weeks 6 and 8. Home and away weeks are set up such that every team has the opportunity to play every other team at least once.²
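The Week 1 and Week 3 pairings can be sketched in a few lines of Python. This is one simple reading of "high-low" and "high-high" and not necessarily how the schedule above was built; the function names and the placeholder teams `T1` through `T14` (ordered best to worst by the previous season's standings) are hypothetical.

```python
def high_low(standings):
    """Best plays worst, second-best plays second-worst, and so on."""
    n = len(standings)
    return [(standings[i], standings[n - 1 - i]) for i in range(n // 2)]

def high_high(standings):
    """First plays second, third plays fourth, and so on down the list."""
    return [(standings[i], standings[i + 1])
            for i in range(0, len(standings) - 1, 2)]

# Placeholder standings, best team first (not real 2015 results).
standings = [f"T{i}" for i in range(1, 15)]
week1 = high_low(standings)   # starts with ("T1", "T14")
week3 = high_high(standings)  # starts with ("T1", "T2")
```

A real schedule would also have to move teams around whenever one of these pairings duplicates a rivalry game, which this sketch does not attempt.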

Here’s what a schedule would look like with these rules in place:

A Big Ten schedule with predetermined and power-paired games

Week	Illinois	Indiana	Iowa	Maryland	Michigan
1	Michigan	Wisconsin	Purdue	@ Mich. St.	@ Illinois
2	Ohio St.	@ Mich. St.	N’western	@ Penn St.	Minnesota
3	@ Rutgers	Minnesota	@ Mich. St.	Purdue	@ Wisconsin
4	Indiana	@ Illinois	@ Minnesota	@ N’western	Mich. St.
5	@ TBD	TBD	TBD	@ TBD	TBD
6	TBD	TBD	@ TBD	@ TBD	@ TBD
7	@ N’western	Purdue	Nebraska	Rutgers	@ Ohio St.
8	@ TBD	@ TBD	@ TBD	TBD	TBD
9	TBD	@ TBD	TBD	TBD	@ TBD
Week	Mich. St.	Minnesota	Nebraska	Northwestern	Ohio State
1	Maryland	Penn State	Ohio St.	@ Rutgers	@ Nebraska
2	Indiana	@ Michigan	@ Purdue	@ Iowa	@ Illinois
3	Iowa	@ Indiana	Penn St.	Ohio St.	@ N’western
4	@ Michigan	Iowa	Wisconsin	Maryland	Penn St.
5	@ TBD	TBD	TBD	@ TBD	@ TBD
6	TBD	@ TBD	@ TBD	TBD	TBD
7	@ Penn St.	@ Wisconsin	@ Iowa	Illinois	Michigan
8	TBD	TBD	@ TBD	@ TBD	@ TBD
9	@ TBD	@ TBD	TBD	TBD	TBD
Week	Penn State	Purdue	Rutgers	Wisconsin
1	@ Minnesota	@ Iowa	N’western	@ Indiana
2	Maryland	Nebraska	Wisconsin	@ Rutgers
3	@ Nebraska	@ Maryland	Illinois	Michigan
4	@ Ohio St.	Rutgers	@ Purdue	@ Nebraska
5	@ TBD	TBD	@ TBD	TBD
6	TBD	TBD	@ TBD	@ TBD
7	Mich. St.	@ Indiana	@ Maryland	Minnesota
8	TBD	@ TBD	TBD	TBD
9	@ TBD	@ TBD	TBD	@ TBD

I’m going to proceed — fairly quickly — through a simulation of this schedule, in order to show you how the power-pairings would work. If a matchup was actually played in real life during the 2016 Big Ten regular season, I abided by the original result — so Ohio State still beats Michigan, for instance.³ Otherwise, I simulated the result using ESPN’s Football Power Index, accounting for home-field advantage. Based on FPI, for instance, Iowa would have an 87 percent chance of winning a home game against Maryland, a matchup that didn’t occur in the actual Big Ten schedule but which could occur under power-pairing.
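The rule for settling each game can be sketched as follows. This is a minimal illustration, not the code behind the simulation: the function name `game_winner` is hypothetical, and the home team's win probability is passed in directly, since the article does not spell out how FPI ratings are converted into probabilities.

```python
import random

def game_winner(home, away, actual_results, home_win_prob, rng=random):
    """Return the winner of one game.

    actual_results maps frozenset({team_a, team_b}) -> real-life winner.
    Keying on a frozenset means the real result is kept even when the
    home team differs from the original matchup.
    """
    key = frozenset((home, away))
    if key in actual_results:
        return actual_results[key]
    # Otherwise, simulate using the supplied home win probability.
    return home if rng.random() < home_win_prob else away

# Ohio State beat Michigan in real life, so that result always stands.
real = {frozenset(("Ohio State", "Michigan")): "Ohio State"}
fixed_result = game_winner("Michigan", "Ohio State", real, 0.5)

# Iowa vs. Maryland was not played in real life; Iowa gets the 87 percent
# home win probability quoted above.
simulated = game_winner("Iowa", "Maryland", real, 0.87, random.Random(2016))
```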

We’ll zoom ahead to Week 5, when we encounter our first flex-scheduling week. (To see the simulated results for every game, scroll down to the big table toward the end of this article.) Here’s how it works: We take the 14 Big Ten teams and split them into pools of seven home teams and seven away teams based on where they’d been assigned to play ahead of time. We then have to pair the teams so as to give each one exactly one opponent for the week. There are, in theory, 5,040 possible ways to do this. An algorithm sorts through each of the combinations to find the best possible set of pairings, using the following rules:

1. It eliminates all combinations that involve a game that was already played or which was already scheduled to be played. This cuts down on the number of legal combinations quite a lot, to about 600 for Week 5, for example.
2. From among the remaining combinations, the algorithm finds those cases where the win totals match up as well as possible.⁴
3. If several combinations are tied after Steps 1 and 2, the algorithm picks the set of matchups that are least likely to occur in the future, based on how the teams are assigned to home and away games in subsequent flex weeks.⁵
4. If several combinations are still tied for being the most optimal after Steps 1, 2 and 3, the algorithm picks one of them at random.
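Here is a hedged Python sketch of that search for Week 5. The win totals and the fixed-week opponents are transcribed from the tables in this article. The function name `best_pairings` is hypothetical, and the sketch skips the third rule (the future-availability tiebreaker), going straight from the win-total comparison to a random pick.

```python
import itertools
import random

# Wins after four weeks, from the Week 5 table.
HOME = {"Michigan": 4, "Indiana": 3, "Iowa": 3, "Wisconsin": 2,
        "Minnesota": 1, "Nebraska": 1, "Purdue": 1}
ROAD = {"Penn State": 4, "Ohio State": 3, "Maryland": 3, "Michigan State": 0,
        "Rutgers": 1, "Northwestern": 1, "Illinois": 1}

# Opponents each home-pool team meets in the fixed weeks (1-4 and 7).
# No flex games have been played before Week 5, so these are the only
# matchups that are off-limits.
FIXED = {
    "Michigan": {"Illinois", "Minnesota", "Wisconsin", "Michigan State", "Ohio State"},
    "Indiana": {"Wisconsin", "Michigan State", "Minnesota", "Illinois", "Purdue"},
    "Iowa": {"Purdue", "Northwestern", "Michigan State", "Minnesota", "Nebraska"},
    "Wisconsin": {"Indiana", "Rutgers", "Michigan", "Nebraska", "Minnesota"},
    "Minnesota": {"Penn State", "Michigan", "Indiana", "Iowa", "Wisconsin"},
    "Nebraska": {"Ohio State", "Purdue", "Penn State", "Wisconsin", "Iowa"},
    "Purdue": {"Iowa", "Nebraska", "Maryland", "Rutgers", "Indiana"},
}

def best_pairings(home, road, fixed):
    """Brute-force every home/road assignment and keep the closest ones."""
    home_teams = list(home)
    candidates = 0
    legal = []
    for perm in itertools.permutations(road):
        candidates += 1
        pairs = tuple(zip(home_teams, perm))
        # Rule 1: drop any combination containing a played or scheduled game.
        if any(away in fixed[host] for host, away in pairs):
            continue
        # Rule 2: total wins separating the paired teams (smaller is better).
        gap = sum(abs(home[host] - road[away]) for host, away in pairs)
        legal.append((gap, pairs))
    smallest = min(gap for gap, _ in legal)
    tied = [pairs for gap, pairs in legal if gap == smallest]
    return candidates, len(legal), smallest, tied

candidates, n_legal, smallest_gap, tied = best_pairings(HOME, ROAD, FIXED)
week5 = random.choice(tied)  # random pick among ties (third rule omitted)
```

The smallest achievable total gap is two wins, because the home pool has 15 wins combined and the road pool has 13. Without the third rule, the random pick may differ from the pairings the article's algorithm selected.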

Here’s what the algorithm came up with for Week 5, for example:

Power-paired Week 5 matchups in hypothetical Big Ten schedule
HOME POOL		ROAD POOL
TEAM	RECORD	TEAM	RECORD
Michigan	4-0	Penn State	4-0
Indiana	3-1	Ohio State	3-1
Iowa	3-1	Maryland	3-1
Wisconsin	2-2	Michigan State	0-4
Minnesota	1-3	Rutgers	1-3
Nebraska	1-3	Northwestern	1-3
Purdue	1-3	Illinois	1-3

That worked out pretty nicely — 12 of the 14 teams were power-paired against an opponent with the same win total, generating a key early matchup between 4-0 Michigan and 4-0 Penn State. Still, the home pool was slightly stronger than the road pool and some team had to draw the short end of the stick. It turned out to be 0-4 Michigan State, which was matched up against 2-2 Wisconsin.

From there, Michigan beat Penn State in that matchup of undefeateds to go to 5-0. Meanwhile, a couple of overachieving 3-1 teams encountered a dose of reality against stiffer competition, as Indiana lost to Ohio State and Maryland lost to Iowa. That’s one of the benefits of power-pairing teams: The pretenders who benefited from quirky wins are fairly quickly weeded out because they face a tougher schedule.

Since hearing about a hypothetical college football season is about as exciting as someone else’s fantasy football team, we’ll work through the rest of the schedule quickly. Ohio State ruined its chances by losing to Iowa in Week 6 (in a matchup that didn’t occur in real life). After Week 8, Penn State and Michigan both wound up at 7-1, with Michigan in the driver’s seat for the conference championship by virtue of having defeated Penn State in Week 5. However, Michigan drew a tough matchup against Iowa in Week 9, which it lost, while Penn State (having already defeated most of the good teams in the conference) beat Illinois to win the conference title. Here are all the simulated games in one chart, in case you want to see the dirty detail:

Big Ten simulated schedule with power-paired matchups

Week	Illinois	Indiana	Iowa	Maryland	Michigan
1	L–Michigan	W–Wisconsin	W–Purdue	W–Mich. St.	W–Illinois
2	L–Ohio St.	W–Mich. St.	L–N’Western	L–Penn St.	W–Minnesota
3	W–Rutgers	L–Minnesota	W–Mich. St.	W–Purdue	W–Wisconsin
4	L–Indiana	W–Illinois	W–Minnesota	W–N’Western	W–Mich. St.
5	L–Purdue	L–Ohio St.	W–Maryland	L–Iowa	W–Penn St.
6	L–Nebraska	W–Maryland	W–Ohio St.	L–Indiana	W–Purdue
7	L–N’Western	W–Purdue	W–Nebraska	W–Rutgers	L–Ohio St.
8	W–Mich. St.	L–Michigan	L–Penn St.	L–Ohio St.	W–Indiana
9	L–Penn St.	W–Rutgers	W–Michigan	L–Minnesota	L–Iowa
Week	Michigan St.	Minnesota	Nebraska	Northwestern	Ohio State
1	L–Maryland	L–Penn St.	L–Ohio St.	L–Rutgers	W–Nebraska
2	L–Indiana	L–Michigan	W–Purdue	W–Iowa	W–Illinois
3	L–Iowa	W–Indiana	L–Penn St.	L–Ohio St.	W–N’Western
4	L–Michigan	L–Iowa	L–Wisconsin	L–Maryland	L–Penn St.
5	L–Wisconsin	W–Rutgers	W–N’Western	L–Nebraska	W–Indiana
6	W–Rutgers	W–N’Western	W–Illinois	L–Minnesota	L–Iowa
7	L–Penn St.	L–Wisconsin	L–Iowa	W–Illinois	W–Michigan
8	L–Illinois	W–Purdue	W–Rutgers	L–Wisconsin	W–Maryland
9	L–Nebraska	W–Maryland	W–Mich. St.	W–Purdue	W–Wisconsin
Week	Penn State	Purdue	Rutgers	Wisconsin
1	W–Minnesota	L–Iowa	W–N’Western	L–Indiana
2	W–Maryland	L–Nebraska	L–Wisconsin	W–Rutgers
3	W–Nebraska	L–Maryland	L–Illinois	L–Michigan
4	W–Ohio St.	W–Rutgers	L–Purdue	W–Nebraska
5	L–Michigan	W–Illinois	L–Minnesota	W–Mich. St.
6	W–Wisconsin	L–Michigan	L–Mich. St.	L–Penn St.
7	W–Mich. St.	L–Indiana	L–Maryland	W–Minnesota
8	W–Iowa	L–Minnesota	L–Nebraska	W–N’Western
9	W–Illinois	L–N’Western	L–Indiana	L–Ohio St.

For me at least, that feels a lot cleaner than having a conference championship game. Thanks to power-pairing, the top four finishers — Penn State, Iowa, Ohio State and Michigan in our simulation — all played one another, so a championship game wouldn’t have left a lot more to prove or disprove.

It’s true that we got slightly lucky in this simulation by having a lone champion (Penn State) instead of a tie. But the bounty of head-to-head games between the top teams under power pairing makes potential ties easier to break, because the best teams would play each other more often.
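To make the head-to-head point concrete, here is a small Python check of the games among the top four finishers, read off the simulated results table. The variable names are illustrative only.

```python
from collections import Counter
from itertools import combinations

top_four = ["Penn State", "Iowa", "Ohio State", "Michigan"]

# (winner, loser) for every simulated game between two top-four teams.
games = [
    ("Penn State", "Ohio State"),  # Week 4
    ("Michigan", "Penn State"),    # Week 5
    ("Iowa", "Ohio State"),        # Week 6
    ("Ohio State", "Michigan"),    # Week 7
    ("Penn State", "Iowa"),        # Week 8
    ("Iowa", "Michigan"),          # Week 9
]

# Did every pair of top-four teams actually meet?
played = {frozenset(game) for game in games}
full_round_robin = all(frozenset(pair) in played
                       for pair in combinations(top_four, 2))

# Wins within this four-team group, usable as a tiebreaker.
h2h_wins = Counter(winner for winner, _ in games)
```

All six possible pairings were played, so any tie among these teams could be broken using results from the field.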

I hear what you’re saying: Penn State beat Ohio State in the real-life Big Ten and the committee chose to ignore that, or at least to de-emphasize it. I certainly don’t mean to suggest that power-pairing would remove every controversy. But in the spirit of a team debate, I have a couple of rebuttals.

First, power-pairing would create a higher number of meaningful games, making it more likely that disputes would be settled on the field. In our simulated season, Penn State played (and defeated) Wisconsin, Nebraska and Illinois, a decent group of opponents whom they didn’t play in the actual regular season,⁶ but skipped games against mediocre Indiana, Purdue and Rutgers, whom they pointlessly faced in real life. That made Penn State’s schedule harder and made its one-loss conference record even more impressive. On the flip side, Ohio State’s schedule got tougher also,⁷ but they couldn’t handle the heat, blowing a game against Iowa that they didn’t have to play in real life. This is the algorithm working as intended: It improves the résumés of the very best teams while also thinning out the crop with (at least theoretically) entertaining games against closely matched opponents.

Second, power-pairing would make teams easier to compare, by eliminating divisions and the potential ambiguities created by conference championship games (for instance, had Florida become the nominal conference champion, despite having more losses, by beating Alabama in the SEC championship). The top teams would simply be those that won the most games from the start of the regular season to the finish. And under power-pairing, the top teams would usually play one another, further aiding comparison.

And third, eliminating conference championship games would free up a week in the schedule, so we could tack on another round to the College Football Playoff without further bloating the college football schedule. That would make it easier for strong conferences such as the Big Ten to place two or three teams into the playoff when deserving.

It isn’t a perfect system, and it’s easy enough to imagine what some of the complaints would sound like. A team’s partisans would curse “the computer” every time the algorithm came up with an opponent they didn’t like. Coordinating travel logistics would become mildly more annoying. But power-pairing would get the best teams in the conference to play one another more often and create more deserving conference champions. It might be a nerdy solution, but it would make for better football.

Footnotes

¹ Alternatively, the teams may be seeded somehow, such that everyone starts out with one matchup against an experienced team and another matchup against an inexperienced team in their first two rounds, for example.
² For example, since Michigan State and Ohio State weren’t originally scheduled to play one another, there has to be at least one flex week where one of them is scheduled to be on the road and the other is scheduled to be at home.
³ This holds even if there’s a different home team than in the original matchup.
⁴ More specifically, it identifies cases where the average number of wins separating the paired teams is the smallest. It’s best to pair three-win home team Indiana against a three-win team from the road pool, for instance. If you can’t do that, then pairing Indiana against a four-win team or a two-win team is the next-best option.
⁵ For instance, Nebraska and Northwestern are both scheduled to play on the road in Week 8 and both scheduled to play at home in Week 9, so if they aren’t matched up against each other in Week 5, the only other chance is Week 6. The algorithm will prioritize that matchup before others in which teams have several more opportunities to face each other.
⁶ Penn State played Wisconsin in the Big Ten championship game, but not in the regular season.
⁷ Ohio State played Iowa and Illinois in our simulated season, sacrificing real-life games against Michigan State and Rutgers.

FiveThirtyEight
