The Statistical State of the Presidential Race

With fewer than 45 days left in the presidential campaign, it’s no longer a cliché to say that every week counts. And there are a few polling-related themes we’ll be watching especially closely this week.

This is probably about the last week, for instance, in which Mitt Romney can reasonably hope that President Obama’s numbers will deteriorate organically as his convention bounce fades. That is not to say that Mr. Obama’s standing could not decline later on in the race, for any number of reasons. But if it does, the decline will probably need to be forced by Mr. Romney’s campaign, or by developments in the news cycle, not the mere loss of post-convention momentum.

We’ll also be looking to see if there is a greater consensus in the polls this week. In general, last week’s numbers started out a bit underwhelming for Mr. Obama — suggesting that the momentum from his convention was eroding — but then picked up strength as the week wore on.

Still, there were splits among the tracking polls and among other national surveys; between state polls that called cellphones and those which did not; and among pollsters who came to a wide variety of conclusions about whose supporters were more enthusiastic and more likely to turn out.

But before we get lost in the weeds, let’s consider a more basic question. What did the polling look like at this stage in past elections, and how did it compare against the actual results?

Our polling database contains surveys going back to 1936. The data is quite thin (essentially just the Gallup national poll and nothing else) through about 1968, but it’s nevertheless worth a look.

In the table below, I’ve averaged the polls that were conducted 40 to 50 days before the election in each year — the time period that we find ourselves in now. (In years when there were no polls in this precise time window, I used the nearest available survey.)
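For readers who want to see the averaging rule spelled out, here is a minimal sketch of it in Python. The function name, the tuple layout and the sample numbers are all assumptions made for illustration; they are not drawn from our polling database, and how ties between equally distant surveys are broken is a guess as well.

```python
from statistics import mean


def late_september_average(polls, lo=40, hi=50):
    """Average the margins of polls taken lo..hi days before the election.

    polls: list of (days_before_election, margin) tuples.
    If no poll falls inside the window, fall back to the single
    survey closest to the window, as the text describes.
    """
    if not polls:
        raise ValueError("no polls supplied")

    in_window = [margin for days, margin in polls if lo <= days <= hi]
    if in_window:
        return mean(in_window)

    def distance_to_window(days):
        # How many days outside the lo..hi window this poll falls.
        if days < lo:
            return lo - days
        return days - hi

    nearest = min(polls, key=lambda poll: distance_to_window(poll[0]))
    return nearest[1]


# PLACEHOLDER values, invented for illustration only (not real polls).
example_polls = [(62, 3.0), (48, 4.0), (44, 6.0), (41, 5.0), (30, 2.0)]
sparse_polls = [(70, 9.0), (55, 7.0)]

window_result = late_september_average(example_polls)
fallback_result = late_september_average(sparse_polls)
```

In the first placeholder list, three surveys fall inside the window and are averaged; in the second, none do, so the survey taken 55 days out stands in for the average.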

The table considers the race from the standpoint of the incumbent party (designated with the color purple) and the challenging party (wearing the orange jerseys), without worrying about whether they were Democrats or Republicans. Mr. Obama’s position, for instance, is probably more analogous to that of the Republican incumbent George W. Bush in 2004 than it is to the candidate from his own party that year, John Kerry.

This is an awful lot of data, but there are several reasonably clear themes.

First, the polling by this time in the cycle has been reasonably good, especially when it comes to calling the winners and losers in the race. Of the 19 candidates who led in the polls at this stage since 1936, 18 won the popular vote (Thomas E. Dewey in 1948 is the exception), and 17 won the Electoral College (Al Gore lost it in 2000, along with Mr. Dewey).

Of course, if Mr. Obama led in the race by 30 percentage points — as Lyndon B. Johnson did in 1964 — there wouldn’t be much need for such detailed analysis, and FiveThirtyEight might be free to blog about the baseball playoffs.

If you eliminate the candidates with double-digit leads, the front-runner’s record is eight Electoral College wins in 10 tries, or a batting average of 80 percent.

This is a simple method, to the point of being crude. But it’s interesting, nevertheless, that the 80 percent figure corresponds quite well with the FiveThirtyEight forecast, which gave Mr. Obama a 78 percent chance of winning as of Sunday night, and with the odds on offer by bookmakers, many of whom list Mr. Obama as about a 4-to-1 favorite.
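The comparison between a win-loss record, a forecast probability and bookmakers’ odds rests on a small conversion, sketched below. The helper names are hypothetical, and the sketch ignores the margin that bookmakers build into their prices, so the implied probability should be read as approximate.

```python
from fractions import Fraction


def record_to_probability(wins, tries):
    """Share of tries that ended in a win."""
    return Fraction(wins, tries)


def odds_to_probability(favorable, unfavorable):
    """Implied probability for a favorite quoted at favorable-to-unfavorable."""
    return Fraction(favorable, favorable + unfavorable)


def probability_to_odds(p):
    """Odds in favor, expressed as a single ratio p : (1 - p)."""
    return p / (1 - p)


# Figures taken from the text: 8 wins in 10 tries, a 4-to-1 favorite,
# and a 78 percent forecast probability.
front_runner_record = record_to_probability(8, 10)
bookmaker_implied = odds_to_probability(4, 1)
forecast_as_odds = probability_to_odds(0.78)
```

Both the historical record and the 4-to-1 price work out to four chances in five, while a 78 percent forecast corresponds to odds of a little better than 3.5-to-1.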

The second theme is one that we’ve brought up before. There has not been any tendency, at least at this stage of the race, for the contest to break toward the challenging candidate.

Instead, it’s actually the incumbent-party candidate who has gained ground on average since 1936. On average, the incumbent candidate added 4.6 percentage points between the late September polls and his actual Election Day result, whereas the challenger gained 2.5 percentage points.

You can slice the data in slightly different ways if you like: by looking at only true incumbent presidents, for instance, as opposed to those who represented the incumbent party after the sitting president retired — or furthermore, you can restrict the sample to elected incumbents, which would exclude cases like Gerald R. Ford in 1976. But it gets you to more or less the same answer.

It is also important to observe, however, that the challenging party’s candidate has gained more ground than the incumbent in each of the past four election cycles (from 1996 through 2008). Statistically speaking, this streak does not tell us all that much (the incumbent party closed well in each year from 1988 through 1992). But perhaps this reflects the fact that the conventions are being held later and later, meaning that the incumbent-party candidate, who holds his convention last, could still be in the midst of a modest convention bounce at this stage of the race. For that reason, I think we’ll need to wait until at least the end of the week to see if Mr. Obama’s numbers hold.

But the point is not to argue for the idea that Mr. Obama is likely to gain ground so much as against the notion that Mr. Romney will necessarily have a tail wind. In 14 of the 19 elections since 1936, both the incumbent and the challenger added at least some points to their standing relative to each candidate’s late September polls.

A corollary to this is that the incumbent (or the challenger, for that matter) does not need to be at 50 percent of the vote to be a clear favorite to win: the eventual winner will probably pick up at least some undecided voters, and at least a few votes will go to third-party candidates. Mr. Obama’s current number in the polls — about 48 or 49 percent on average in national surveys — is very similar to those of George W. Bush in 2004, George H.W. Bush in 1988, and Franklin D. Roosevelt in 1944, all of whom won, some of them easily.

Harry S. Truman won the 1948 election despite being at just 39 percent at this point in the polls. His opponent, Mr. Dewey, achieved the highest standing in the late September polls (47 percent) of any candidate (incumbent or challenger) who failed to win the election, although John F. Kennedy came quite close to losing in 1960 despite being at 49 percent in the Gallup poll in September.

To the extent there’s a useful rule of thumb about a candidate achieving 50 percent in the polls, it is this: a candidate who reaches 50 percent of the vote late in the race is almost certain to win. Below that threshold, there are fewer guarantees. But a candidate (incumbent or challenger) at 48 or 49 percent of the vote will normally be a clear favorite.

Nonetheless, another theme: although Mr. Obama’s raw vote share looks reasonably strong, his margin over Mr. Romney is not that impressive for an elected incumbent. On average, elected incumbents have led by 7.7 percentage points at this stage of the race, larger than Mr. Obama’s advantage, which is in the range of four points.

However, this also helps to explain why Mr. Obama is leading in the race despite a mediocre economy. If an elected incumbent wins by a margin in the high single digits in an average year, that gives him quite a bit of slack if conditions are below-average, but not terrible. The economy is bad, but perhaps not quite bad enough to oust an elected incumbent who otherwise has a fair number of advantages.

The next point is that large changes can occur late in the race, or at least large errors in the polling. There were four years (1936, 1948, 1968 and 1972) in which the actual election result diverged by at least 10 points from the late September polls, and several other years (like 1980) when there was a shift in the mid-to-high single digits. Of these years, only 1948 reversed the winner — but there were also a lot of close calls, like a near-comeback by Hubert H. Humphrey in 1968, who went from 15 points down to losing to Richard M. Nixon by less than a full percentage point.

A general rule in statistical analysis is that close calls really ought to count, at least for partial credit. Several election years — certainly 1960, 1968 and 2000, and arguably 1976 and 2004 — were close enough that their results could have been altered by essentially random factors.

But these late changes in the polls seem to be becoming less frequent. Since 1972, the average change between the late September polls and the election result is 4.9 percentage points in one direction or another, versus an average error of 7.1 percentage points between 1936 and 1968. And the shifts have been smaller still, 3.7 percentage points on average, in the five elections since 1992.
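A change “in one direction or another” is an absolute shift: a four-point gain and a four-point loss both count as four points, rather than canceling out. The sketch below shows the distinction. The function names are hypothetical and the sample margins are placeholders invented for illustration, not figures from our database.

```python
from statistics import mean


def mean_absolute_shift(pairs):
    """Average size of the shift, ignoring its direction.

    pairs: list of (september_margin, final_margin) tuples.
    """
    return mean(abs(final - september) for september, final in pairs)


def mean_signed_shift(pairs):
    """Average shift with direction kept, so opposite moves offset."""
    return mean(final - september for september, final in pairs)


# PLACEHOLDER margins, invented for illustration only.
sample = [(5.0, 8.0), (-3.0, 1.0), (10.0, 6.0)]

absolute = mean_absolute_shift(sample)
signed = mean_signed_shift(sample)
```

With these placeholder numbers the signed average is much smaller than the absolute one, which is why the absolute version is the better gauge of how far late polls can miss.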

Does this reflect improved (or at least more abundant) polling, changing behavior in the electorate, or both? Presumably a little of both. Gallup, for instance, had Mr. Dewey defeating Mr. Truman in 1948, but if there had been a dozen pollsters in the field back then, would they all have shown that same result? (Consider that, until Sunday, Gallup’s national tracking poll showed a tied race — whereas virtually every other state and national pollster has produced numbers consistent with Mr. Obama holding at least a small lead.)

But there should also be little doubt that Americans are tuning into the presidential race earlier, and that they are becoming more partisan, two trends that lock them into their candidate choices sooner and reduce late-stage volatility. And an increasing number of Americans are taking advantage of early voting — which is already under way in some states — meaning that they cast their ballot sooner in an entirely literal sense.

Next, and related, there are few undecided voters this year. On average among national polls, about 7 percent of voters either say they are undecided, or that they will vote for a third-party candidate — the same percentage as in 2004, when voters committed early to Mr. Bush or Mr. Kerry. The figures are slightly lower than at a comparable point in 2008, and considerably lower than in 2000.

By the way, I am intentionally lumping undecided voters and potential votes for third-party candidates together. Some voters who are not thrilled with the major-party choices may name a third-party candidate when a pollster gives them the option, but then grudgingly vote Democrat or Republican for fear of wasting their votes otherwise. For this reason, polls generally overstate the standing of third-party candidates, and for forecasting purposes it may be proper to treat ostensible third-party voters as de facto undecideds.

The exception is when a third-party candidate is potentially more viable, like H. Ross Perot in 1992. But just as a greater number of undecided voters contributes volatility to the outcome, so does the presence of strong third-party choices. In those years, there are three vectors along which votes can move — between the Democrat and the independent, the Democrat and the Republican, and the independent and the Republican — as opposed to just one. Many of the years associated with the largest late-stage errors in the polling, like 1968 and 1980, were also associated with third-party candidates.

Thus, although a shift of several percentage points in Mr. Romney’s favor is far from impossible, or even all that unlikely, this also looks like a year in which volatility in the polls might be lower than average. Third-party candidates are playing only a minor role this year, there are few undecideds and the late-stage movement in the polls has been on a secular downward trend over the past two decades.

Furthermore, there tends to be less movement in the polls in reasonably close elections than in blowouts, when the trailing candidate can sometimes receive a dead-cat bounce, or when the front-runner’s advantage grows from large to larger if the trailing candidate’s supporters are too despondent to turn out, as may have been the case for Walter Mondale’s Democrats in 1984.

And indeed, volatility has been low throughout the campaign. Just as in the stock market, past volatility seems to predict future volatility in the polls.

So this is why, despite the importance of the big picture, we will also need to sweat the small stuff this week. It seems plausible that by seven days from now, the consensus of data could point toward anything from Mr. Obama being a two-point favorite (about where the race was before the conventions) to being as much as six points ahead (as some of his stronger state polls seem to imply). Likewise, he could be at anywhere from about 47 percent of the vote (if his numbers recede from a convention bounce) to 50 percent (if his bounce holds and he inches forward as undecided voters commit).

This makes an enormous amount of difference. Based on the way that our forecast model calculates it, a candidate ahead by two percentage points at this stage would be about a two-to-one favorite to win, odds that Mr. Romney might have to accept at this stage if it meant improving his position enough to make further gains later. But a candidate ahead by six points would have around a 90 percent chance of victory.
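One rough way to see how a lead translates into a probability is to treat the remaining uncertainty in the margin as a bell curve. The sketch below is not the FiveThirtyEight forecast model. It assumes a normal distribution and a standard deviation of 4.7 points, a value chosen only because it roughly reproduces the two figures quoted above, and it ignores the Electoral College entirely.

```python
from statistics import NormalDist

# ASSUMPTION: a standard deviation picked to match the figures in the
# text; it is not a parameter of the actual forecast model.
ASSUMED_SIGMA = 4.7


def win_probability(lead, sigma=ASSUMED_SIGMA):
    """Chance the final margin stays above zero, given today's lead.

    Models the final margin as lead + normally distributed error.
    """
    return NormalDist(mu=0.0, sigma=sigma).cdf(lead)


def odds_in_favor(p):
    """Convert a probability into odds in favor, as a single ratio."""
    return p / (1 - p)


two_point_lead = win_probability(2.0)
six_point_lead = win_probability(6.0)
two_point_odds = odds_in_favor(two_point_lead)
```

Under these assumptions, a two-point lead comes out close to a two-to-one proposition and a six-point lead to roughly 90 percent, which suggests why a few points of movement matter so much at this stage.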

A version of this article appears in print on 09/25/2012, on page A10 of the New York edition with the headline: A Solid Record for Polls As Elections Near.
