
Our presidential forecast, which launched today, is not the first election forecast that FiveThirtyEight has published since 2016. There was our midterms forecast in 2018, which was pretty accurate in predicting the makeup of the House and the Senate. And there was our presidential primaries model earlier this year, which was a bit of an adventure but mostly notable for being bullish (correctly) on Joe Biden and (incorrectly) on Bernie Sanders. But we’re aware that the publication of our first presidential forecast since 2016 is liable to be fraught.

We’d like to address one thing upfront, though: We think our model did a good job in 2016. Although it had Hillary Clinton favored, it gave Donald Trump around a 30 percent chance of winning on Election Day, which was considerably higher than other models, prediction markets and the conventional wisdom about the race. Moreover, the reasons the model was more bullish on Trump than other forecasts — such as detecting a potential overperformance for Trump in the Electoral College — proved to be important to the outcome.

Also, we’ve found that FiveThirtyEight’s models — including our election forecasts since they were first published in 2008 — have been well calibrated over time. Candidates who our models claim have a 30 percent chance of winning really do win their races about 30 percent of the time, for example.

So if this were an ordinary election, we’d probably just say screw it, take the 2016 version of our model, make some modest improvements, and press “go.” We’d certainly devote more attention to how the model was presented, but the underlying math behind it would be about the same.

We are not so sure that this is an ordinary election, though. Rather, it is being contested amid the most serious pandemic to hit the United States since 1918. So we’ve been doing a lot of thinking about how COVID-19 and other news developments could affect various aspects of the race, ranging from its impact on the economy to how it could alter the actual process of voting.

Put another way, while we think “ZOMG 2016!!!” is not a good reason to rethink a model that tended to be pretty cautious in the first place, we think COVID-19 might be.

What’s different from 2016

In the end, our model still isn’t that different from 2016’s, but let’s run through the list of changes. After that, we’ll provide a front-to-back description of how our model works.

First, a number of changes in the model are related to COVID-19:

Other changes fall more into the category of continual improvements we’re making to our models that aren’t directly related to COVID-19:

The rest of how our model works involves three major steps. What follows is a pretty detailed walk-through, but I’ll be more circumspect when discussing steps described at more length elsewhere, such as in our 2016 methodology guide.

Step 1: Collect, analyze and adjust polls

Our national and state polling averages, which we began publishing in June, are the first steps we take in building our election forecast. We detailed our process for constructing those polling averages when we released them, so I’ll just review the highlights here.

As we noted, the calculation of the polling averages is the first step in calculating our forecast. But they are not the same thing.

One time when this distinction is particularly relevant is following major events such as the debates and party conventions. These events sometimes produce big swings in the polls, and our polling averages are designed to be aggressive following these events and reflect the changed state of the race. However, these shifts are not necessarily long-lasting, and after a couple of weeks, the polls sometimes revert to where they were before.

Therefore, the model relies only partly on the polling average of the race after one of these events happens. For instance, say there is a debate on Oct. 1 and you’re looking at the model on, for example, Oct. 5. It will use a blend of the post-debate polling average from Oct. 5 and the pre-debate polling average from Oct. 1. After a week or two (depending on the event) though, the model will fully use the post-event polling average because it no longer necessarily expects a reversion to the mean.
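To make the mechanics concrete, here's a minimal sketch in Python of how that kind of blend could work. The linear ramp and the 14-day window are illustrative assumptions, not the model's actual parameters:

```python
def blended_average(pre_event_avg, post_event_avg, days_since_event, reversion_window=14):
    """Blend the pre- and post-event polling averages.

    The post-event average gets more weight as the event recedes; after
    `reversion_window` days the model relies fully on the post-event average.
    The linear ramp and the 14-day window are illustrative assumptions only.
    """
    w_post = min(days_since_event / reversion_window, 1.0)
    return w_post * post_event_avg + (1 - w_post) * pre_event_avg

# Example: a debate on Oct. 1, viewed on Oct. 5 (4 days later).
# Pre-debate margin: Biden +7; post-debate polling average: Biden +10.
print(blended_average(pre_event_avg=7.0, post_event_avg=10.0, days_since_event=4))
# -> ~7.86, i.e. only part of the apparent post-debate bounce is taken at face value
```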

In addition, our presidential model has traditionally applied a convention bounce adjustment that reflects the predictable boost in the polls that a party tends to get following its convention. Clinton surged to some of her biggest leads of the cycle following the Democratic Convention in 2016, for example. However, three factors could mitigate the convention bounce this year.

Thus, the convention bounce adjustments will be small this year. Polls conducted in the period between the Democratic convention and the Republican convention will be adjusted toward Trump by around 2 or 2.5 percentage points, depending on the precise dates of the polls. And polls in the two to three weeks after the Republican convention will be adjusted toward Biden but only very slightly so (by less than 1 full percentage point).

Step 2: Combine polls with “fundamentals,” such as demographic and economic data

As compared with other models, FiveThirtyEight’s forecast relies heavily on polls. We do, however, incorporate other data in two main ways:

Enhancing our polling averages

At the core of the modeled estimate is FiveThirtyEight’s partisan lean index, which reflects how the state voted in the past two presidential elections as compared with the national average. In our partisan lean index, 75 percent of the weight is assigned to 2016 and 25 percent to 2012. So note, for example, that Ohio (which turned much redder between 2012 and 2016) is not necessarily expected to continue to become redder. Instead, it might revert somewhat to the mean and become more purple again.
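Here's a rough sketch of that 75/25 weighting, using approximate Ohio margins for illustration (the other adjustments described below are omitted):

```python
def partisan_lean(state_margin_2016, national_margin_2016,
                  state_margin_2012, national_margin_2012):
    """Weighted partisan lean: how much more Democratic (positive) or
    Republican (negative) a state voted than the country as a whole,
    with 75 percent of the weight on 2016 and 25 percent on 2012.
    (The additional adjustments described in the article are omitted
    from this sketch.)"""
    lean_2016 = state_margin_2016 - national_margin_2016
    lean_2012 = state_margin_2012 - national_margin_2012
    return 0.75 * lean_2016 + 0.25 * lean_2012

# Rough illustration with Ohio (Dem-minus-Rep margins in points, approximate):
# 2016: Ohio R+8 vs. national D+2  -> lean of about -10
# 2012: Ohio D+3 vs. national D+4  -> lean of about -1
print(partisan_lean(-8, 2, 3, 4))
# -> -7.75, i.e. redder than the nation, but less red than 2016 alone would imply
```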

The partisan lean index also contains a number of other adjustments:

We then apply the partisan lean index in three slightly different ways to create a modeled estimate of the vote in each state.

We then combine these three estimates to create an ensemble forecast for each state. The rigid method, which is the most accurate historically, receives the majority of the weight, followed by the demographic regression and then the regional regression.

Then, we combine the ensemble forecast with a state’s polling average to create an enhanced snapshot of the current conditions in each state. The weight given to the polling average depends on the volume of polling in each state and how recently the last poll of the state was conducted. As of the forecast launch (Aug. 12), around 55 percent of the weight goes to the polling average rather than to the ensemble in the average state. However, in well-polled states toward the end of the campaign, as much as 97 or 98 percent of the weight could go toward the polling average. Conversely, states that have few polls rely mostly on the ensemble technique (and states that have no polls use the ensemble in lieu of a polling average).
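To give a flavor of how that weighting behaves, here's a toy version; the functional form and the specific numbers are illustrative only, not the model's actual weighting function:

```python
def enhanced_snapshot(poll_avg, ensemble_estimate, n_polls, days_since_last_poll):
    """Blend a state's polling average with the ensemble (non-poll) estimate.

    This captures the qualitative behavior described above: more (and fresher)
    polling pushes the weight on the polling average toward ~1, while states
    with no polls fall back entirely on the ensemble. The exact functional
    form and constants here are placeholders.
    """
    if n_polls == 0:
        return ensemble_estimate
    # Illustrative saturating weight: grows with poll volume, decays with staleness.
    volume_factor = n_polls / (n_polls + 3.0)
    recency_factor = 1.0 / (1.0 + days_since_last_poll / 30.0)
    w_polls = min(volume_factor * recency_factor, 0.98)  # capped near 97-98 percent
    return w_polls * poll_avg + (1 - w_polls) * ensemble_estimate
```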

Next, we combine the enhanced snapshots in each state to create a national snapshot, which is essentially our prediction of the national popular vote margin in an election held today. The national snapshot accounts for projected voter turnout in each state based on population growth since 2016, changes in how easy it is to vote since 2016, and how close the race is in that state currently — closer-polling states tend to have higher turnout. National polls are not used in the national snapshot; it’s simply a summation of the snapshots in the 50 states and Washington, D.C.
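In code terms, the national snapshot is just a turnout-weighted average of the state snapshots. Here's a toy illustration, with the turnout projections taken as given (the turnout model itself, based on population growth, voting-law changes and race closeness, is outside this sketch):

```python
def national_snapshot(state_snapshots, projected_turnout):
    """Aggregate state snapshots into a national popular-vote margin.

    state_snapshots:   state -> projected Dem-minus-Rep margin (points)
    projected_turnout: state -> projected number of votes

    National polls are not used; the national figure is just the
    turnout-weighted sum of the 50 states plus D.C.
    """
    total_votes = sum(projected_turnout.values())
    return sum(margin * projected_turnout[state]
               for state, margin in state_snapshots.items()) / total_votes

# Toy two-"state" example:
print(national_snapshot({"A": +10.0, "B": -2.0}, {"A": 1_000_000, "B": 3_000_000}))
# -> +1.0, since the larger state B pulls the national margin toward its own
```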

We know this is starting to get pretty involved — we’re really in the guts of the model now — but there is another important step. Our national snapshot is not the same thing as our prediction of the Election Day outcome. Instead, our prediction blends the polling-driven snapshot with a “fundamentals forecast” based on economic conditions and whether an incumbent is seeking reelection.

Polls vs. Fundamentals

I’m on the record as saying that I think presidential forecasting models based strictly on “fundamental” factors like economic conditions are overrated. Without getting too deep into the weeds, it’s easy to “p-hack” your way to glory with these models because there are so many ways to measure “the economy” but only a small sample size of elections for which we have reliable economic data. The telltale sign of these problems is that models claiming to predict past elections extremely well often produce inaccurate — or even ridiculous — answers when applied to elections in which the result is unknown ahead of time. One popular model based on second-quarter GDP, for example, implies that Biden is currently on track to win nearly 1,000 electoral votes — a bit of a problem since the maximum number theoretically achievable is 538.

At the same time, that doesn’t mean the fundamentals are of no use at all. They can provide value and gently nudge your forecast in the right direction — if you use them carefully (although they’re hard to use carefully amidst something like the pandemic).

So, since 2012, we have used an index of economic conditions in our presidential forecast. In its current incarnation, it includes six variables:

All variables are standardized so that they have roughly the same mean and standard deviation — and, therefore, have roughly equal influence on the index — for economic data since 1946. The index is then based on readings of these variables in the two years leading up to the election (e.g., from November 2018 through November 2020 for this election) but with a considerably heavier weight placed on the more recent data, in particular, the data roughly six months preceding the election. Where possible, the index is calibrated based on “vintage” economic data — that is, data as it was published in real time — rather than on data as later revised.
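Here's a simplified sketch of that standardize-then-average construction. The particular recency weights are placeholders rather than the model's actual weights, and the vintage-data handling is omitted:

```python
import numpy as np

def economic_index(recent_readings, historical_series, recency_weights):
    """Equal-footing economic index (a simplified sketch).

    Each variable is standardized against its post-1946 history (z-scores),
    averaged over the past two years with heavier weights on recent readings,
    and then the variables are averaged with equal influence.

    recent_readings[var]   -> array of that variable over the past two years
    historical_series[var] -> long-run history used for standardization
    recency_weights        -> array, same length as each recent series
    """
    z_by_var = []
    for var, series in recent_readings.items():
        hist = np.asarray(historical_series[var], dtype=float)
        z = (np.asarray(series, dtype=float) - hist.mean()) / hist.std()
        z_by_var.append(np.average(z, weights=recency_weights))
    return float(np.mean(z_by_var))  # the index itself is expressed as a z-score
```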

Although the quality of economic data is more questionable prior to the 1948 election, we have also attempted to create an approximate version of the index for elections going back to 1880 based on the data that we could find. (It’s extremely important, in our view, to expand the sample size for this sort of analysis, even if we have to rely on slightly less reliable data to do so.) Our economic index for elections dating to 1880 (see below) is expressed as a Z-score, where a score of zero reflects an average economy. And, as you can see, extremely negative economic conditions tend to predict doom for the incumbent party (as in 1932, 1980 and 2008).

The economy is a noisy predictor of presidential success

FiveThirtyEight’s economic index as of Election Day, since 1880,* where a score of zero reflects an average economy, a positive score a strong economy and a negative score a weak one


*Values prior to the 1948 election are based on more limited data and should be considered rough estimates.

But, overall, the relationship between economic conditions and the incumbent party’s performance is fairly noisy. In fact, we found that the economy explains only around 30 percent of the variation in the incumbent party’s performance, meaning that other factors explain the other 70 percent.

We do try to account for some of those “other” factors, although we’ve found they make only a modest difference. For instance, we also account for whether the president is an elected incumbent (like Trump this year or Barack Obama in 2012), an incumbent who followed the line of succession into office (like Gerald Ford in 1976) or if there is no incumbent at all (as in 2008 or 2016). We also account for polarization based on how far apart the parties are in roll call votes cast in the U.S. House. Periods of greater polarization (such as today in the U.S.) are associated with closer electoral margins and also smaller impacts of economic conditions and incumbency.

One additional complication is that the condition of the economy at any given moment prior to the election may not resemble what it eventually looks like in November, which is what our model tries to predict. Thus, the model makes a simple forecast for each of the six economic variables, which accounts for some mean-reversion, but is also based on the recent performance of the stock market (yes, it has some predictive power) and surveys of professional economists.

Although we’ll discuss this at more length in the feature that accompanies our forecast launch, the fundamentals forecast is not necessarily as bad as you might think for Trump, despite awful numbers in categories such as GDP. One of the economic components that the model considers (income) has been strong thanks to government subsidies in the form of the CARES Act, for instance, and two others (inflation and the stock market) have been reasonably favorable, too.

In addition, Trump is an elected incumbent, the economy is expected to improve between the forecast launch (Aug. 12) and November, and the polarized nature of the electorate limits the damage to him to some degree. Thus, one shouldn’t conclude that Trump is a huge underdog on the basis of the economy alone, although he’s also not a favorite to win reelection as elected incumbents typically are.

The closer to Election Day, the more our model relies on polls

Share of the weight assigned to polls and the “fundamentals,” by number of days until the election


However, our model assigns relatively little weight to the fundamentals forecast, and the weight will eventually decline to zero by Election Day. (Although the fundamentals forecast does do a good job of forecasting most recent elections, there are a lot more misses once you extend the analysis to elections before 1948. So keep that in mind, as the assigned weight is based on the entire data set.) The table above shows how much the model weights the fundamentals up until the election.

As of forecast launch in mid-August, for instance, the model assigns 77 percent of the weight to the polling-based snapshot and 23 percent of the weight to the fundamentals. In fact, the fundamentals actually help Trump at the margin (they aren’t good for him, but they’re better than his polls), so the model shifts the snapshot in each state slightly toward Trump in the forecast of the Election Day outcome. States with higher elasticity scores are shifted slightly more in this process.
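As a toy illustration of that blending (the margins here are made up, and only the 23 percent fundamentals weight from launch day is taken from the article):

```python
def election_day_forecast(snapshot_margin, fundamentals_margin,
                          fundamentals_weight, elasticity=1.0):
    """Blend the polling-driven snapshot with the fundamentals forecast.

    fundamentals_weight is ~0.23 at the mid-August launch and declines to zero
    by Election Day (the schedule is in the table above and not reproduced
    here). The resulting shift is scaled by each state's elasticity score, so
    more elastic states move a bit more. The margins below are illustrative.
    """
    shift = fundamentals_weight * (fundamentals_margin - snapshot_margin)
    return snapshot_margin + elasticity * shift

# National example: snapshot Biden +8, fundamentals-only forecast Biden +3 (made-up numbers)
print(election_day_forecast(8.0, 3.0, fundamentals_weight=0.23))  # -> 6.85
```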

Step 3: Account for uncertainty, and simulate the election thousands of times

Complicated though it may seem, everything I’ve described up until this point is, in some sense, the easy part of developing our model. There’s no doubt that Biden is comfortably ahead as of the forecast launch in mid-August, for example, and the choices one makes in using different methods to average polls or combine them with other data aren’t likely to change that conclusion.

What’s trickier is figuring out how that translates into a probability of Biden or Trump winning the election. That’s what this section is about.

Before we proceed further, one disclaimer about the scope of the model: It seeks to reflect the vote as cast on Election Day, assuming that there are reasonable efforts to allow eligible citizens to vote and to count all legal ballots, and that electors are awarded to the popular-vote winner in each state. It does not account for the possibility of extraconstitutional shenanigans by Trump or by anyone else, such as trying to prevent mail ballots from being counted.

That does not mean it’s safe to assume these rules and norms will be respected. (If we were sure they would be respected, there wouldn’t be any need for this disclaimer!) But it’s just not in the purview of the sort of statistical analysis we conduct in our model to determine the likelihood they will or won’t be respected.

We do think, however, that well-constructed polls and models can provide a useful benchmark if any attempts to manipulate the election do occur. For instance, a candidate declaring themselves the winner in a state with incomplete results (because mail ballots have yet to be counted) where the model had given them a 0.4 percent chance of winning would need to be regarded with more suspicion than one who had a 40 percent chance going in (although a 40 percent chance of winning is by no means a sure thing either, obviously).

With that disclaimer out of the way, here are the four types of uncertainty that the model tries to account for:

  1. National drift, or how much the overall national forecast could change between now and Election Day.
  2. National Election Day error, or how much our final forecast of the national popular vote could be off on Election Day itself.
  3. Correlated state error, which reflects errors that could occur across multiple states along geographic or regional lines — for instance, as was relevant in 2016, a systematic underperformance relative to polls for the Democratic candidate in the Midwest.
  4. State-specific error, an error relative to our forecast that affects only one state.

The first type of error, national drift, is probably the most important one as of the launch — that is, the biggest reason Biden might not win despite currently enjoying a fairly wide lead in the polls is that the race could change between now and November.

National drift is calculated as follows:

National Drift = Constant × (Days Until Election)^(1/3) × Uncertainty Index

That is, it is a function of the cube root of the number of days until the election times the FiveThirtyEight Uncertainty Index, which I’ll describe in a moment. (Note that the use of the cube root implies that polls do not become more accurate at a linear rate, but rather that there is a sharp increase in accuracy toward the end of an election. Put another way, August is still early as far as polling goes.)
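In code form, with a placeholder constant (the actual constant isn't published) and the uncertainty index, which is described just below, treated as a plain number:

```python
def national_drift(days_until_election, uncertainty_index, constant=0.5):
    """National drift per the formula above:
    Constant x (Days Until Election)^(1/3) x Uncertainty Index.
    The constant here is a placeholder, not the model's actual value.
    """
    return constant * days_until_election ** (1 / 3) * uncertainty_index

# Cube-root scaling means uncertainty shrinks slowly at first, then quickly near the end:
for days in (90, 30, 7, 1):
    print(days, round(national_drift(days, uncertainty_index=1.0), 2))
# 90 -> 2.24, 30 -> 1.55, 7 -> 0.96, 1 -> 0.5
```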

The uncertainty index is a new feature this year, although it reflects a number of things we did previously, such as accounting for the number of undecided voters. In the spirit of our economic index, it also contains a number of measures that are historically correlated with greater (or lesser) uncertainty but are also correlated with one another in complicated ways. And under circumstances like these (not to mention the small sample size of presidential elections), we think it is better to use an equally-weighted blend of all reasonable metrics rather than picking and choosing just one or two metrics.

The components of our uncertainty index are as follows:

  1. The number of undecided voters in national polls. More undecided voters means more uncertainty.
  2. The number of undecided plus third-party voters in national polls. More third-party voters means more uncertainty.
  3. Polarization, as measured elsewhere in the model, is based on how far apart the parties are in roll call votes cast in the U.S. House. More polarization means less uncertainty since there are fewer swing voters.
  4. The volatility of the national polling average. Volatility tends to predict itself, so a stable polling average tends to remain stable.
  5. The overall volume of national polling. More polling means less uncertainty.
  6. The magnitude of the difference between the polling-based national snapshot and the fundamentals forecast. A wider gap means more uncertainty.
  7. The standard deviation of the component variables used in the FiveThirtyEight economic index. More economic volatility means more overall uncertainty in the forecast.
  8. The volume of major news, as measured by the number of full-width New York Times headlines in the past 500 days, with more recent days weighted more heavily. More news means more uncertainty.
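Here's a simplified sketch of how an equally weighted index of these components could be computed. The standardization and sign-flipping mechanics are assumptions filled in for illustration; the list above specifies only the components and the direction of their effect:

```python
import numpy as np

def uncertainty_index(component_values, historical_components):
    """Equally weighted uncertainty index (a simplified sketch).

    component_values: dict of the eight raw readings listed above (undecideds,
    undecideds plus third party, polarization, poll volatility, poll volume,
    polls-vs-fundamentals gap, economic volatility, news volume). Each reading
    is standardized against its own history, and components where a higher
    reading means *less* uncertainty (polarization, poll volume) are flipped
    so that higher always means more uncertainty.
    """
    LESS_UNCERTAINTY = {"polarization", "poll_volume"}
    z_scores = []
    for name, value in component_values.items():
        hist = np.asarray(historical_components[name], dtype=float)
        z = (value - hist.mean()) / hist.std()
        z_scores.append(-z if name in LESS_UNCERTAINTY else z)
    return float(np.mean(z_scores))
```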

In 2020, measures No. 1 through 5 all imply below-average uncertainty. There aren’t many undecided voters, there are no major third-party candidates, polarization has been high and polls have been stable. Measure No. 6 suggests average uncertainty. But metrics No. 7 and 8 imply extremely high uncertainty; there has been a ton of news related to COVID-19 and other major stories, like the protests advocating for police reform in response to the death of George Floyd — not to mention the impeachment trial of Trump earlier this year. Likewise, there has been as much volatility in economic data as at any time since the Great Depression.

On the one hand, the sheer number of uncertainties unique to 2020 indicates the possibility of a volatile election, but on the other hand, there are also a number of measures that signal lower uncertainty, like a very stable polling average. So when we calculate the overall degree of uncertainty for 2020, our model’s best guess is that it is about average relative to elections since 1972. That average, of course, includes a number of volatile elections such as 1980, 1988 and 1992, where there were huge swings in the polls over the final few months of the campaign, along with elections such as 2004 and 2012 where polls were pretty stable. As voters consume even more economic- and pandemic-related news — and then experience events like the conventions and the debates — it’s not yet clear whether the polls will remain stable or begin to swing around more.

It’s also not entirely clear how this might all translate into the national Election Day error — that is, how far off the mark our final polling averages are — either. In calculating Election Day error, we use a different version of the uncertainty index that de-emphasizes components No. 6, 7 and 8, since those components pertain mostly to how much we expect the polls to change between now and the election, rather than the possibility of an Election Day misfire.

Still, our approach to calculating Election Day error is fairly conservative. In order to have a larger sample size, the calculation is based on the error in final polls in elections since 1936, rather than solely on more recent elections. While polls weren’t as far off the mark in 2016 as is generally reputed (national polls were fairly accurate, in fact), it’s also not clear that the extremely precise polls in the final weeks of 2004, 2008 and 2012 will be easy to replicate given the challenges in polling today. Given the small sample sizes, we also use a fat-tailed distribution for many of the error components, including the national Election Day error, to reflect the small — but not zero — possibility of a larger error than what we’ve seen historically.
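For illustration, here's one way to draw fat-tailed errors. The Student-t distribution and its degrees of freedom are illustrative choices rather than the model's actual specification:

```python
import numpy as np

rng = np.random.default_rng(538)

def sample_national_eday_error(scale, df=5, n_sims=40_000):
    """Draw national Election Day errors from a fat-tailed distribution.

    A Student-t with a handful of degrees of freedom (df=5 is an assumption)
    allows occasional errors much larger than anything in the historical
    record, which is the point of using fat tails.
    """
    return scale * rng.standard_t(df, size=n_sims)

errors = sample_national_eday_error(scale=2.0)
print(round(float(np.mean(np.abs(errors) > 6)), 3))  # share of sims with a 6+ point miss
```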

There could also be some challenges related to polling during COVID-19. In primary elections conducted during the pandemic, for instance, turnout was hard to predict. In some ways, the pandemic makes voting easier (expanded options to vote by mail in many states), but it also makes it harder in other ways (it’s difficult to socially distance if you must vote in person).

This is a rough estimate because there are a lot of confounding variables — including the end of the competitive portion of the Democratic presidential primary — but we estimate that the variability in turnout was about 50 percent higher in primary elections conducted after the pandemic began in the U.S. than in those conducted beforehand. Empirically, we know that states that experience a sharp change in turnout from one cycle to the next are harder to forecast, too. So we estimate that a 50 percent increase in error when predicting turnout will result in a 20 percent increase in error when predicting the share of the vote each party receives.

Therefore, we increase national Election Day error, correlated state error and state-specific error by 20 percent relative to their usual values because of how the coronavirus could affect turnout and the process of voting. Note that this still won’t be enough to cover extraordinary developments such as mail ballots being impounded. But it should help to reflect some of the additional challenges in polling and holding an election amidst a pandemic.

When it comes to simulating the election — we’re running 40,000 simulations each time the model is updated — the model first picks two random numbers to reflect national drift (how much the national forecast could change) and national Election Day error (how off our final forecast of the national popular vote could be) that are applied more or less uniformly to all states. However, even if you somehow magically knew what the final national popular vote would be, there would still be additional error at the state level. A uniform national swing would not have been enough to cost Clinton the Electoral College in 2016, for example. But underperformance relative to the polls concentrated in the Midwestern swing states did.

In fact, we estimate that at the end of the campaign, most of the error associated with state polling is likely to be correlated with errors in other states. That is to say, it is improbable that there would be a major polling error in Michigan that wouldn’t also be reflected in similar states such as Wisconsin and Ohio.

Therefore, to calculate correlated polling error, the model creates random permutations based on different demographic and geographic characteristics. In one simulation, for instance, Trump would do surprisingly well with Hispanic voters and thus overperform in states with large numbers of Hispanics. In another simulation, Biden would overperform his polls in states with large numbers of Catholics. The variables used in the simulations are as follows:

One mathematical property of correlated polling errors is that states with demographics that resemble those of the country as a whole tend to have less polling error than those that don’t. Underestimating Biden’s standing among Mormons wouldn’t cause too many problems in a national poll, or in a poll of Florida, for example. But it could lead to a huge polling error in Utah. Put another way, states that are outliers based on some combination of the variables listed above tend to be harder to predict.

Finally, the model randomly applies some residual, state-specific error in each state. This tends to be relatively small, and is primarily a function of the volume of polling in each state, especially in states that have had no polling at all. If you’re wondering why Trump’s chances are higher than you might expect in Oregon, for example, it’s partly because there have been no polls there as of forecast launch.
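Putting Step 3 together, here's a highly simplified sketch of one batch of simulations. The scale parameters are made up, and for brevity it folds national drift and national Election Day error into a single national draw:

```python
import numpy as np

rng = np.random.default_rng(2020)

def simulate_state_margins(forecast_margins, demographics, sigma_national=3.0,
                           sigma_demo=1.5, sigma_state=1.0, n_sims=40_000):
    """Toy version of the simulation step, with made-up scale parameters.

    forecast_margins: (n_states,) forecast Dem-minus-Rep margins in points.
    demographics:     (n_states, n_factors) standardized demographic/geographic
                      scores (e.g., share Hispanic, share Catholic).
    Each simulation draws one fat-tailed national error applied to every state,
    one shock per demographic factor (producing errors that are correlated
    across similar states), and a small independent state-specific error.
    """
    n_states, n_factors = demographics.shape
    national = rng.standard_t(5, size=(n_sims, 1)) * sigma_national
    demo_shocks = rng.normal(size=(n_sims, n_factors)) * sigma_demo
    correlated = demo_shocks @ demographics.T            # (n_sims, n_states)
    state_noise = rng.normal(size=(n_sims, n_states)) * sigma_state
    return forecast_margins + national + correlated + state_noise

# Two demographically identical states end up with highly correlated errors:
margins = np.array([2.0, 1.0])
demo = np.array([[1.0, 0.5], [1.0, 0.5]])
sims = simulate_state_margins(margins, demo)
print(round(float(np.corrcoef(sims[:, 0], sims[:, 1])[0, 1]), 2))
# -> a correlation close to 1, since the two states share demographics
```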

Odds and ends

Whew — that’s pretty much it! But here are a few random bullet points that don’t fit neatly into the categories above.

Got any other questions or see anything that looks wrong? Please drop us a line.

