Based on what most of us would have thought possible a year or two ago, the election of Donald Trump was one of the most shocking events in American political history. But it shouldn’t have been that much of a surprise based on the polls — at least if you were reading FiveThirtyEight. Given the historical accuracy of polling and where each candidate’s support was distributed, the polls showed a race that was both fairly close and highly uncertain.
This isn’t just a case of hindsight bias. It’s tricky to decide what tone to take in an article like this one — after all, we had Hillary Clinton favored. But one of the reasons to build a model — perhaps the most important reason — is to measure uncertainty and to account for risk. If polling were perfect, you wouldn’t need to do this. And we took weeks of abuse from people who thought we overrated Trump’s chances. For most of the presidential campaign, FiveThirtyEight’s forecast gave Trump much better odds than other polling-based models. Our final forecast, issued early Tuesday evening, had Trump with a 29 percent chance of winning the Electoral College.1 By comparison, other models tracked by The New York Times put Trump’s odds at 15 percent, 8 percent, 2 percent and less than 1 percent. And betting markets put Trump’s chances at just 18 percent at midnight on Tuesday, when Dixville Notch, New Hampshire, cast its votes.
So why did our model — using basically the same data as everyone else — show such a different result? We’ve covered this question before, but it’s interesting to do so in light of the actual election results. We think the outcome — and particularly the fact that Trump won the Electoral College while losing the popular vote — validates important features of our approach. More importantly, it helps to explain why Trump won the presidency.
A small, systematic polling error made a big difference
Clinton was leading in the vast majority of national polls, and in polls of enough states to get her to 270 electoral votes, although her position in New Hampshire was tenuous in the waning days of the campaign. So there wasn’t any reasonable way to construct a polling-based model that showed Trump ahead. Even the Trump campaign itself put their candidate’s chances at 30 percent, right about where FiveThirtyEight had him.
But people mistake a large volume of polling data for the absence of uncertainty. It doesn’t work that way. Yes, having more polls helps to a degree, by reducing sampling error and by providing for a mix of reasonable methodologies. So it’s better to be ahead in two polls than in one, and in 10 polls than in two. Before long, however, you start to encounter diminishing returns. Polls tend to replicate one another’s mistakes: If a particular demographic subgroup is hard to reach on the phone, for instance, the polls may try different workarounds, but they’re all likely to have problems of some kind or another. The cacophony of headlines about how “CLINTON LEADS IN POLL” neglected the fact that these leads were often quite small and that if one poll missed, the others probably would too. As I pointed out on Wednesday, if Clinton had done only 2 percentage points better across the board, she would have received 307 electoral votes and the polls would have “called” 49 of 50 states correctly.
FiveThirtyEight’s probabilities are based on the accuracy of polling averages in presidential elections dating back to 1972. That is, our models are based on how accurate polls have or haven’t been historically, instead of making idealized assumptions about them. For instance, national polling averages in the final week of the campaign have missed the actual outcome by an average of about 2 percentage points. That’s larger than you’d expect from sampling error alone2 and suggests that the polls sometimes suffer from systematic error: Almost all of them are off in the same direction.
Historically, meanwhile, the error is larger in state polls than in national polls. That’s because there’s less of an opportunity for polling errors to cancel each other out. Suppose, for example, that the polls underestimate Clinton’s performance with Hispanic voters, but overestimate it among white voters without college degrees. In national polls, the overall effect might be relatively neutral. But the state polls will err in opposite directions, overestimating Clinton’s performance in states with lots of noncollege white voters but underestimating it in Hispanic-heavy states.
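The mechanism can be seen in a toy calculation. The group shares and error sizes below are invented purely for illustration (they are not FiveThirtyEight’s actual numbers): an error that nets out to zero nationally still pushes individual state polls in opposite directions.

```python
# Toy model: polls understate Clinton's margin among Hispanic voters and
# overstate it among non-college white voters, by equal amounts.
# All numbers are hypothetical, chosen only to illustrate the mechanism.
error_by_group = {"hispanic": +6.0, "noncollege_white": -6.0}  # true minus polled margin, points

# Shares of the electorate belonging to each group (again, made up):
national = {"hispanic": 0.12, "noncollege_white": 0.12}
state_a = {"hispanic": 0.30, "noncollege_white": 0.05}   # Hispanic-heavy state
state_b = {"hispanic": 0.03, "noncollege_white": 0.35}   # Rust Belt-like state

def net_error(shares):
    """Net error in the topline margin: each group's error weighted by its share."""
    return sum(share * error_by_group[group] for group, share in shares.items())

print(round(net_error(national), 2))  # → 0.0: the national polls look fine
print(round(net_error(state_a), 2))   # → 1.5: polls understate Clinton here
print(round(net_error(state_b), 2))   # → -1.92: polls overstate Clinton here
```

The two group-level errors cancel in the national electorate but not in states where one group dominates — which is exactly the pattern described above.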
That’s something like what happened this year. In fact, the error in national polls wasn’t any worse than usual. Clinton was ahead by 3 to 4 percentage points in the final national polls. She already leads in the popular vote, and that lead will expand as mail ballots are counted from California and Washington, probably until her overall margin reaches 1 to 2 percentage points. That would mean only about a 2-point miss for the national polls. They may well wind up being more accurate than in 2012, when they missed by 2.7 percentage points.
But what about the state polls? They were all over the place. Clinton actually overperformed FiveThirtyEight’s adjusted polling average in 11 states and the District of Columbia. The problem is that these states were California, Hawaii, Illinois, Massachusetts, Nevada, New Jersey, New York, New Mexico, Oregon, Rhode Island and Washington. Since all of these states except for Nevada and perhaps New Mexico were already solidly blue, that only helped Clinton to run up the popular vote margin in states whose electoral votes she was already assured of. That’s especially true of California, where Clinton both beat her polls by more than 5 percentage points and substantially improved on Barack Obama’s performance from 2012.
Clinton collapsed in the Midwest, destroying her Electoral College chances
Conversely, Clinton underperformed her polls significantly throughout the Midwest and the Rust Belt: by 4 points in Michigan and Minnesota, by 5 points in Pennsylvania and by 6 points in Iowa, Ohio and Wisconsin. Clinton just narrowly held on to win Minnesota, and she hadn’t been favored in Iowa or Ohio to begin with. But Michigan, Wisconsin and Pennsylvania flipped to Trump and cost her the election. (Otherwise, she’d have wound up with 278 electoral votes.)
FiveThirtyEight’s models consider possibilities such as these. In addition to a systematic national polling error, we also simulate potential errors across regional or demographic lines — for instance, Clinton might underperform in the Midwest in one simulation, or there might be a huge surge of support among white evangelicals for Trump in another simulation. These simulations test how robust a candidate’s lead is to various types of polling errors.
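A stripped-down Monte Carlo sketch makes the point. The margins, error sizes and the split between shared and state-specific error below are invented for illustration — this is not FiveThirtyEight’s model — but it shows why correlation matters: flipping three Midwestern states at once is far more likely when their polling errors share a common component than when the errors are independent.

```python
import math
import random

random.seed(42)

# Hypothetical final polling margins (Clinton minus Trump, percentage points).
margins = {"MI": 4.0, "WI": 5.0, "PA": 4.0}

def p_sweep(shared_sd, total_sd=3.5, n=100_000):
    """Estimate the probability that Trump flips all three states.

    Each state's polling error is a shared component (the same draw in every
    state) plus an independent one; the total per-state error is held at
    total_sd either way, so only the correlation differs between scenarios.
    """
    indep_sd = math.sqrt(total_sd**2 - shared_sd**2)
    sweeps = 0
    for _ in range(n):
        shared = random.gauss(0.0, shared_sd)
        if all(m + shared + random.gauss(0.0, indep_sd) < 0 for m in margins.values()):
            sweeps += 1
    return sweeps / n

print(p_sweep(shared_sd=0.0))  # independent errors: a three-state sweep is rare
print(p_sweep(shared_sd=3.0))  # mostly shared error: a sweep is many times likelier
```

With independent errors, a sweep requires three separate unlucky draws; with a shared component, one bad draw drags all three states down together — which is why a lead spread across correlated states is less robust than it looks.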
In fact, Clinton’s Electoral College leads weren’t very robust. And the single biggest reason was her relatively weak polling in the Midwest, especially as compared to President Obama four years ago. Because the outcomes in these Midwestern states were highly correlated, having problems in any one of them would mean that Clinton was probably having trouble in the others, as well.
There just aren’t enough electoral votes in swing states elsewhere in the country for a Democrat to survive a Midwestern collapse. Michigan, Wisconsin, Minnesota, Ohio, Iowa and Pennsylvania (which is not a part of the geographic Midwest, but which functions like a Midwestern state politically) together have 80 electoral votes. Lose all of those states, and a Democrat would still lose even with Florida, North Carolina, Colorado, Nevada and Virginia in her column.
Eventually, Democrats will find new battleground states. Clinton came closer to winning Arizona and Georgia than she did to winning Ohio, and closer to winning Texas than she did to winning Iowa. By 2024 or 2028, these may all have become purple states. In the interim, the Electoral College could get awkward for Democrats, with states such as Pennsylvania having gone from bluish to reddish, and states like Arizona and Georgia becoming more purple, but taking their time to get there. The Electoral College was already pretty awkward for them this year, obviously, which is why our model showed more than a 10 percent chance of a popular vote/Electoral College split in Trump’s favor.
Undecideds and late deciders broke for Trump
The single most important reason that our model gave Trump a better chance than others is our assumption that polling errors are correlated. No matter how many polls you have in a state, it’s often the case that all or most of them miss in the same direction. Furthermore, if the polls miss in one direction in one state, they often also miss in the same direction in other states, especially if those states are similar demographically.
There were some other factors, however, that helped Trump’s chances in our forecast. One is that our model considers the number of undecided and third-party voters when evaluating the uncertainty in the race. There were far more of these voters than in recent elections: About 12 percent of the electorate wasn’t committed to either Trump or Clinton in final national polls, as compared with just 3 percent in 2012. That’s a big part of the reason our model was quite confident about Obama’s chances in 2012, but not all that confident about Clinton’s chances this year.
Indeed, late-deciding voters broke toward Trump, according to exit polls of most swing states. Or at least, that was the case in states where Trump outperformed his polls, such as in Pennsylvania and Wisconsin. It wasn’t as true in states such as Nevada and Virginia, where Clinton matched or exceeded her polls:
[Table: Vote share of those who decided in the week before the election, by state]
Pollsters simply can’t do much about voters who make up their minds only after the survey is completed. (And making inferences is a guessing game: It’s sometimes said that undecideds tend to break to the challenger, but the empirical evidence on this is mixed. Obama won late-deciding votes in 2012, for example.) But modelers can do something about it, by allowing for more uncertainty in the forecast when there are more undecideds. If only 3 percent of the electorate is undecided, then winning undecideds 3-2 — as Trump did in several swing states — will shift the overall outcome by less than 1 percentage point. But if 12 percent of the electorate is undecided, winning them by that ratio will produce a net swing of 2 to 3 points toward a candidate, potentially letting him overtake the front-runner.
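The arithmetic in that paragraph is easy to check. The helper below is just a back-of-envelope calculation, not part of any forecasting model:

```python
def net_swing(undecided_share, split=0.6):
    """Percentage-point shift in the margin when undecideds break split-to-(1-split)
    toward one candidate; undecided_share is in percentage points of the electorate."""
    return undecided_share * (split - (1 - split))

print(round(net_swing(3), 1))   # → 0.6: a 3-2 break among 3% undecided moves the margin under a point
print(round(net_swing(12), 1))  # → 2.4: among 12% undecided, the same break moves it 2 to 3 points
```

That 2.4-point shift is enough to erase a modest polling lead outright, which is why a large undecided share alone justified a wider uncertainty interval.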
Trump was also gaining ground on Clinton over the final two weeks of the campaign (although with a slight rebound for Clinton in the last 48 hours or so). There’s a lot of room to debate how much a model should chase down a polling swing when one occurs in, say, July or August. FiveThirtyEight’s polls-only model is notoriously aggressive about trying to reflect the public polls as they appear on the day of the forecast, even if it makes for a swingier result. Our polls-plus model, and most other forecasts, are more conservative and have various techniques for discounting polling swings. But as Election Day approaches, we think models ought to be fairly aggressive about detecting polling movement. (Polls-plus stops discounting polling swings by the end of the campaign.) If there’s a late-breaking news event, such as FBI Director James Comey’s letter to Congress on Oct. 28, even a “temporary” effect may still weigh on voters at the time they cast their ballot.
A failure of conventional wisdom more than a failure of polling
It’s one thing to criticize pollsters — or polling-based forecasts — if your personal prediction came closer to getting the outcome right. But I’d assert that most mainstream journalists would have given Trump much lower odds than the 30 percent chance that FiveThirtyEight gave him, and that most campaign coverage was premised on the idea that Clinton was all but certain to become the next president. Both reporters and pundits criticized FiveThirtyEight and other polling sites for not accounting for early voting data, for example, on the idea that it portended good news for Clinton that our model ignored. As we’ve discovered in the past, however, it’s hard to make inferences from early voting, and attempts to do so have a fairly bad track record — as they did this year.3
We also received a lot of criticism from Democratic partisans in the closing weeks of the campaign — more than we did from Trump supporters — because they thought we didn’t have Clinton as a heavy enough favorite. That’s unusual. We’ve forecasted enough races over the years to have taken criticism from almost every side. But in the past, it’s always been the trailing candidate’s supporters who gave us more grief.
In this respect, there’s another parallel between Trump’s victory on Tuesday and the United Kingdom’s vote to leave the European Union in June. Brexit polls showed the race almost tied, with “Remain” leading by perhaps half a percentage point. In fact, “Leave” won by about 4 percentage points. The polls took a lot of criticism even though they’d shown “Leave” at almost even-money, whereas betting markets — and the conventional wisdom from London-based reporters — had “Remain” heavily favored to prevail. Londoners may have interpreted the data in selective ways because of the “unthinkability” of Britain’s leaving the EU to people in their social circles.
Tuesday’s results were similar. We strongly disagree with the idea that there was a massive polling error. Instead, there was a modest polling error, well in line with historical polling errors, but even a modest error was enough to provide for plenty of paths to victory for Trump. We think people should have been better prepared for it. There was widespread complacency about Clinton’s chances in a way that wasn’t justified by a careful analysis of the data and the uncertainties surrounding it.