How FiveThirtyEight Is Forecasting The 2017 NCAA Tournament

Editor’s note: This article is an adapted version of one we published last year about how our March Madness predictions work.

Welcome to FiveThirtyEight’s March Madness predictions for the men’s and women’s NCAA basketball tournaments. We’ve been issuing probabilistic March Madness forecasts in some form since 2011, when FiveThirtyEight was just a couple of us writing for The New York Times.

Here’s how we computed everything in this year’s forecast.

Live win probabilities

Our interactive graphic will include a dashboard that shows the score and time remaining in every game as it’s played, as well as the chance that each team will win that game. These probabilities are derived using logistic regression analysis, which lets us plug the current state of a game into a model to produce each team’s probability of winning. Specifically, we used play-by-play data from the past five seasons of Division I NCAA basketball to fit a model that incorporates:

Time remaining in the game
Score difference
Pre-game win probabilities
Which team has possession, with a special adjustment if the team is shooting free throws.

These in-game win probabilities won’t account for everything. If a key player has fouled out of a game, for example, his or her team’s win probability is probably a bit lower than we’ve listed. There are also a few places where the model experiences momentary uncertainty: In the handful of seconds between the moment when a player is fouled and the free throws that follow, we use the team’s average free-throw percentage. Still, these probabilities ought to do a reasonably good job of showing which games are competitive and which are in the bag.
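To make the idea concrete, here is a minimal sketch of how a logistic model can turn the four inputs listed above into a win probability. It is not the fitted FiveThirtyEight model: the coefficients, the way the inputs are combined, and the names `live_win_probability`, `LEAD_COEF` and `POSSESSION_COEF` are all assumptions chosen for illustration. Pending free throws are handled by adding the shooter’s expected points (free-throw percentage times attempts) to the score difference.

```python
import math

# Hypothetical values for illustration only; a real model would fit these
# to play-by-play data with logistic regression.
LEAD_COEF = 0.4          # weight on the time-scaled score difference
POSSESSION_COEF = 0.15   # small bonus for the team holding the ball
GAME_MINUTES = 40.0      # regulation length of a college game


def logit(p):
    """Log-odds of a probability strictly between 0 and 1."""
    return math.log(p / (1.0 - p))


def sigmoid(z):
    """Inverse of logit: maps any real number to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


def live_win_probability(score_diff, minutes_left, pregame_prob, a_has_ball,
                         pending_free_throws=0, ft_pct=0.70):
    """Probability that team A wins, given the current game state.

    score_diff is A's points minus B's points. pregame_prob is A's
    pre-game win probability (strictly between 0 and 1).
    """
    side = 1.0 if a_has_ball else -1.0
    # Credit the shooting team with its expected free-throw points.
    expected_diff = score_diff + side * pending_free_throws * ft_pct
    # A lead is worth more as the clock runs down.
    lead_term = LEAD_COEF * expected_diff / math.sqrt(max(minutes_left, 0.05))
    # The pre-game expectation fades as the game is played.
    prior_term = logit(pregame_prob) * max(minutes_left, 0.0) / GAME_MINUTES
    possession_term = side * POSSESSION_COEF
    return sigmoid(lead_term + prior_term + possession_term)
```

Under these assumed coefficients, a 10-point lead with one minute left produces a higher probability than the same lead with 20 minutes left, which matches the intuition behind the live dashboard.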

We built a separate in-game probability model for the women’s tournament that works in exactly the same way but uses historical women’s data. Thus, we’ll be updating our forecasts live for both the men’s and women’s tournament.

Excitement index

Our March Madness “excitement index” (loosely based on Brian Burke’s NFL work) is a measure of how much each team’s chances of winning changed over the course of the game and is a good reference for picking the best games to flip to.

The calculation is simple: It’s the average change in win probability per basket scored, weighted by the amount of time remaining in the game. This means that a late-game basket has more influence on a game’s rating than a basket near the beginning of the game. We give additional weight to changes in win probability in overtime. Ratings range from 0 to 10, except in extreme cases where they can exceed 10.
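The calculation can be sketched in a few lines. The exact weights and scaling are not given here, so the linear time weight, the `overtime_weight` of 2.5 and the `scale` of 100 below are assumptions for illustration, as is the function name `excitement_index`.

```python
def excitement_index(events, start_prob, scale=100.0, overtime_weight=2.5):
    """Time-weighted average swing in win probability per basket.

    events is a list of (minutes_left, in_overtime, win_prob_after_basket)
    tuples in game order; start_prob is the win probability at tip-off.
    The weights and scale are hypothetical, not published values.
    """
    if not events:
        return 0.0
    previous = start_prob
    weighted_sum = 0.0
    for minutes_left, in_overtime, prob in events:
        if in_overtime:
            weight = overtime_weight
        else:
            # Grows from 1.0 at tip-off to 2.0 at the end of regulation.
            weight = 2.0 - minutes_left / 40.0
        weighted_sum += weight * abs(prob - previous)
        previous = prob
    return scale * weighted_sum / len(events)
```

With this weighting, the same swing in win probability counts for more late in regulation than early, and for more still in overtime. Nothing in the sketch caps the result at 10, consistent with the note that extreme games can exceed that value.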

Elo ratings

Beyond the live features described above, the methodology for our men’s forecasts is largely the same as last year’s. But we’ve developed our own computer rating system, Elo, which we include along with the five computer rankings and two human rankings we used previously.

If you’ve followed FiveThirtyEight, you’ll know that we’re big fans of Elo ratings, which we’ve introduced for the NBA, the NFL and other sports. We’ve now applied them for men’s college basketball teams dating back to the 1950s, using game data from ESPN, Sports-Reference.com and other sources.

Our methodology for calculating these Elo ratings is highly similar to the one we use for the NBA. The ratings rely on relatively simple information: the final score, home-court advantage and the location of each game. (College basketball teams perform significantly worse when they travel a long distance to play a game.) They also account for a team’s conference (at the beginning of each season, a team’s Elo rating is regressed toward the mean of other schools in its conference) and for whether the game was an NCAA Tournament game. We’ve found that historically, there are actually fewer upsets in the NCAA Tournament than you’d expect from the difference in teams’ Elo ratings, perhaps because the games are played under better and fairer conditions in the tournament than in the regular season. Our Elo ratings account for this and also weight tournament games slightly more heavily than regular-season ones.
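For readers unfamiliar with Elo, the following is a minimal sketch of a generic Elo update that includes some of the ingredients mentioned above: home-court advantage, the final score, extra weight for tournament games and preseason regression toward a conference mean. The constants (`K`, `HOME_BONUS`, `TOURNEY_WEIGHT`, `REVERT`) and the logarithmic margin multiplier are assumptions for illustration, not FiveThirtyEight’s actual parameters, and the travel-distance and tournament-upset adjustments are left out.

```python
import math

# Hypothetical constants for illustration only.
K = 20.0              # base size of each rating update
HOME_BONUS = 100.0    # rating points added for the home team
TOURNEY_WEIGHT = 1.1  # tournament games count slightly more
REVERT = 0.3          # share of the gap to the conference mean closed each preseason


def expected_score(rating_a, rating_b):
    """Standard Elo win expectancy for team A against team B."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def update(rating_a, rating_b, points_a, points_b,
           a_is_home=False, b_is_home=False, tournament=False):
    """Return both teams' new ratings after one game (no ties in basketball)."""
    adj_a = rating_a + (HOME_BONUS if a_is_home else 0.0)
    adj_b = rating_b + (HOME_BONUS if b_is_home else 0.0)
    expected_a = expected_score(adj_a, adj_b)
    actual_a = 1.0 if points_a > points_b else 0.0
    # Assumed margin-of-victory multiplier: bigger wins move ratings more.
    margin_mult = math.log(abs(points_a - points_b) + 1.0)
    k = K * (TOURNEY_WEIGHT if tournament else 1.0)
    shift = k * margin_mult * (actual_a - expected_a)
    return rating_a + shift, rating_b - shift


def preseason_rating(last_rating, conference_mean):
    """Regress last season's final rating toward the conference mean."""
    return last_rating + REVERT * (conference_mean - last_rating)
```

Because the update is zero-sum, whatever one team gains the other loses, and an underdog’s win moves both ratings further than a favorite’s win by the same margin.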

Elo ratings for the 68 teams that qualified for the men’s tournament are listed below.

2017 NCAA Tournament team ratings
TEAM	REGION	SEED	ELO RATING	COMPOSITE RATING	FINAL 4 PROBABILITY	CHAMPIONSHIP PROBABILITY
Villanova	East	1	2142	95.2	40.2%	15.0%
Gonzaga	West	1	2029	93.7	41.5	13.8
Kansas	Midwest	1	2058	92.2	38.0	10.4
Kentucky	South	2	2054	92.3	30.2	8.2
North Carolina	South	1	2030	91.7	29.9	7.0
Duke	East	2	2044	92.3	23.7	6.7
Louisville	Midwest	2	1978	90.8	21.6	5.0
Arizona	West	2	2038	89.0	16.1	4.4
West Virginia	West	4	1966	90.8	14.7	3.5
UCLA	South	3	1965	88.0	9.8	2.5
Virginia	East	5	1924	90.0	9.6	2.5
Saint Mary’s (CA)	West	7	1888	87.4	11.8	2.1
Purdue	Midwest	4	1932	88.6	10.6	2.0
Wichita State	South	10	1972	88.9	8.4	2.0
Southern Methodist	East	6	2019	88.4	7.2	1.7
Iowa State	Midwest	5	1959	87.9	9.0	1.7
Baylor	East	3	1925	87.7	6.4	1.4
Oregon	Midwest	3	2026	87.3	6.6	1.2
Butler	South	4	1892	86.5	8.6	1.1
Florida	East	4	1946	87.8	5.7	1.1
Florida State	West	3	1897	87.2	7.0	1.0
Cincinnati	South	6	1903	87.4	5.3	0.9
Wisconsin	East	8	1874	87.8	4.4	0.9
Michigan	Midwest	7	1968	86.9	5.0	0.8
Notre Dame	West	5	1932	86.7	3.9	0.6
Creighton	Midwest	6	1887	84.4	2.8	0.4
Oklahoma State	Midwest	10	1863	84.7	2.0	0.3
Miami (FL)	Midwest	8	1867	84.6	1.6	0.2
Arkansas	South	8	1827	83.2	1.7	0.2
Vanderbilt	West	9	1816	83.8	1.3	0.1
Rhode Island	Midwest	11	1847	84.0	1.3	0.1
Kansas State	South	11	1745	83.1	0.8	0.1
South Carolina	East	7	1745	83.1	1.1	0.1
Seton Hall	South	9	1864	83.0	1.2	0.1
Dayton	South	7	1800	82.8	1.1	0.1
Marquette	East	10	1830	83.0	0.9	0.1
Michigan State	Midwest	9	1791	82.8	1.0	<0.1
Wake Forest	South	11	1797	83.0	0.7	<0.1
Xavier	West	11	1773	82.3	0.9	<0.1
Virginia Commonwealth	West	10	1823	82.9	0.9	<0.1
Middle Tennessee	South	12	1816	81.3	1.2	<0.1
Maryland	West	6	1754	82.5	0.9	<0.1
Northwestern	West	8	1764	82.6	0.8	<0.1
Minnesota	South	5	1827	81.2	1.0	<0.1
Providence	East	11	1805	81.8	0.3	<0.1
Southern California	East	11	1764	81.2	0.2	<0.1
Nevada	Midwest	12	1827	80.7	0.2	<0.1
Princeton	West	12	1824	80.0	0.2	<0.1
North Carolina-Wilmington	East	12	1798	80.2	0.2	<0.1
Virginia Tech	East	9	1822	80.0	0.1	<0.1
Vermont	Midwest	13	1786	79.5	0.1	<0.1
Bucknell	West	13	1679	77.9	0.1	<0.1
East Tennessee State	East	13	1721	78.1	0.1	<0.1
Winthrop	South	13	1664	75.5	0.1	<0.1
Florida Gulf Coast	West	14	1619	75.8	<0.1	<0.1
New Mexico State	East	14	1630	75.6	<0.1	<0.1
Iona	Midwest	14	1608	75.5	<0.1	<0.1
Kent State	South	14	1625	74.3	<0.1	<0.1
Troy	East	15	1643	73.3	<0.1	<0.1
Northern Kentucky	South	15	1614	72.8	<0.1	<0.1
South Dakota State	West	16	1624	72.8	<0.1	<0.1
North Dakota	West	15	1591	72.3	<0.1	<0.1
Texas Southern	South	16	1502	71.0	<0.1	<0.1
Jacksonville State	Midwest	15	1548	71.2	<0.1	<0.1
North Carolina Central	Midwest	16	1513	71.0	<0.1	<0.1
UC-Davis	Midwest	16	1528	69.9	<0.1	<0.1
Mount St. Mary’s	East	16	1454	69.8	<0.1	<0.1
New Orleans	East	16	1524	69.2	<0.1	<0.1

Note, however, that Elo is still just one of six computer rankings that we use for the men’s tournament. The other five are ESPN’s BPI, Jeff Sagarin’s “predictor” ratings, Ken Pomeroy’s ratings, Joel Sokol’s LRMC ratings, and Sonny Moore’s computer power ratings. In addition, we use two human-generated rating systems: the selection committee’s 68-team “S-Curve”, and a composite of preseason ratings from coaches and media polls. The eight systems — six computer-generated and two human-generated — are weighted equally in coming up with a team’s overall rating.
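Equal weighting amounts to a simple average. The sketch below assumes each of the eight systems has already been converted to a common rating scale; that conversion step is not described here, and the key names in `SYSTEMS` and the function name `composite_rating` are hypothetical.

```python
from statistics import mean

# Hypothetical keys for the six computer systems and two human systems.
SYSTEMS = ("elo", "bpi", "sagarin", "pomeroy", "lrmc", "moore",
           "s_curve", "preseason")


def composite_rating(ratings):
    """Equal-weight average of the eight systems for one team.

    ratings maps each system name to that system's rating for the team,
    assumed to be on a shared scale already.
    """
    missing = [name for name in SYSTEMS if name not in ratings]
    if missing:
        raise ValueError("missing systems: " + ", ".join(missing))
    return mean(ratings[name] for name in SYSTEMS)
```

Requiring all eight inputs keeps a team with a missing rating from being averaged over fewer systems without anyone noticing.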

We’ve calculated Elo ratings for men’s teams only. For women’s ratings, we rely on the same composite of rating systems that we used last year. You can find more about the methodology for our women’s forecasts here.

As has been the case previously, our ratings are also adjusted for travel distance and (for men’s teams only) player injuries. Our injury adjustment has been slightly improved to account for the higher or lower caliber of replacement players on different teams.
