Notes on Poll-Watching, as Shift Toward General Election Season Begins

In a couple of weeks, we’ll be starting our FiveThirtyEight forecasts for the general election. But there’s still a fair amount of work to get those ready, and we seem to be seeing an increasing number of head-to-head polls pairing President Obama against Mitt Romney in the meantime. Thus, as we transition into general election season, I wanted to articulate a few quick reminders about the philosophy I take toward looking at these polls and other important pieces of general-election-related data.

1. Be patient. Many of the poll-watching habits you learned for the primaries you will need to unlearn for the general election.

In the primaries, it is often worth paying a tremendous amount of attention to how recently a poll was conducted. Because voter opinion shifts rapidly in primaries, a poll that is even two or three days old might have substantially less information value than one that was released today.

That just isn’t true in the general election, when there are fewer swing voters, the candidates are better known, and voter preferences are more rigid. Instead, polls have a much stronger tendency to revert to the mean, and what is perceived to be “momentum” is often just statistical noise. In October, it might be worth sweating just a little bit if there seems to be a two- or three-percentage-point shift against your preferred candidate. Right now, it probably isn’t; a poll released on April 20 isn’t going to be much better in the long run than one released on April 10.

2. Take the poll average. This ought to be obvious, but you should generally be looking for a trend to show up in several different polls from several different polling firms before you start to view it as newsworthy. Again, this differs a little bit from the primaries because there is less of a premium on recency in the general election; you’re usually better off waiting for another data point (or better yet, two or three more).

The easiest way to do this is to take an average of recent polls, as sites like Real Clear Politics do. The technique that FiveThirtyEight uses is a little fancier, taking a weighted average of polls based on their past accuracy as well as their methodological standards. However, the gains from doing this are modest as compared with the simple average method. In contrast, taking even that simple polling average provides for considerable gains in accuracy over any one poll taken alone.
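To make the comparison concrete, here is a minimal Python sketch of the two approaches. The margins and quality weights are invented for illustration, and the weighting scheme is only a stand-in; it should not be read as the actual FiveThirtyEight formula.

```python
from statistics import mean


def simple_average(margins):
    """Unweighted mean of poll margins (candidate X minus candidate Y, in points)."""
    return mean(margins)


def weighted_average(margins, weights):
    """Weighted mean of poll margins.

    `weights` are hypothetical pollster-quality scores (higher = more trusted).
    """
    if len(margins) != len(weights):
        raise ValueError("margins and weights must have the same length")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive number")
    return sum(m * w for m, w in zip(margins, weights)) / total


# Hypothetical margins from six recent polls, with made-up quality weights.
margins = [4.0, 5.0, 3.0, 6.0, 4.0, -2.0]
weights = [1.0, 1.0, 0.5, 1.5, 1.0, 0.5]

simple = simple_average(margins)
weighted = weighted_average(margins, weights)
print(f"simple average:   {simple:+.1f}")
print(f"weighted average: {weighted:+.1f}")
```

With these made-up inputs the two averages land within a point of each other, while individual polls range from -2 to +6.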

3. Discount, but don’t throw out, “outlier” polls. If the polling average does better than an individual poll, a logical corollary is that a poll which is far outside the consensus is more likely to do a bad job of predicting the outcome. Put differently, a poll that looks like an outlier fairly often turns out to be one. If there are five polls that show Candidate X ahead by a clear margin in a state, and a sixth comes out showing Candidate Y in front instead, you’re really misleading yourself and misinforming your readers if you write an 800-word blog post touting the predictive powers of the new poll and ignoring what the consensus says.

With that said, I also do not think that you should throw out the outlier poll. Unless you have an extremely strong reason to doubt the provenance of a poll, it’s usually still worth including it in your average.

Keep in mind that even accurate and honest pollsters will have outliers once in a while because of the randomness inherent in sampling. It will usually not be productive to claim that the sample is “rigged” or “biased.” Throw it in the average, and move on.

4. Pay attention to “house effects.” The counter to this is that while I am not a fan of litigating any one poll to death, it is certainly worth paying attention to systematic trends in a polling firm’s body of work. If a poll consistently shows a house effect — that is, a partisan lean toward either Democratic or Republican candidates — that’s something you can, and probably should, correct for.

Pollsters make different methodological choices that can produce these house effects. Usually (although perhaps not always) these choices are made in good faith. The point is that these differences often become fairly predictable. If a poll exhibits a consistent house effect of three percentage points favoring Mr. Obama, for instance, our model basically subtracts that difference right back out.
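A rough sketch of that correction in Python follows. The way the house effect is estimated here (a firm's average deviation from the average of other polls taken at about the same time) is an assumption made for illustration, as are all of the numbers; it is not a description of the model itself.

```python
from statistics import mean


def estimate_house_effect(firm_margins, consensus_margins):
    """Average amount by which a firm's margins exceed the consensus.

    Margins are in percentage points, positive = favors Mr. Obama.
    Each firm margin is paired with the consensus margin from the same period.
    """
    if len(firm_margins) != len(consensus_margins):
        raise ValueError("inputs must be the same length")
    return mean(f - c for f, c in zip(firm_margins, consensus_margins))


def adjust_for_house_effect(margin, house_effect):
    """Subtract the firm's estimated lean back out of a reported margin."""
    return margin - house_effect


# Hypothetical history: this firm has run about 3 points more favorable to Mr. Obama.
firm_history = [5.0, 7.0, 4.0]
consensus_history = [2.0, 4.0, 1.0]

lean = estimate_house_effect(firm_history, consensus_history)
adjusted = adjust_for_house_effect(6.0, lean)  # the firm's newest poll: Obama +6
print(f"house effect: {lean:+.1f}, adjusted margin: {adjusted:+.1f}")
```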

5. Pay attention to likely voters versus registered voters. It is worth looking at whether the poll is conducted among registered voters, likely voters or all adults.

In the past eight presidential election cycles or so, the Republican candidate has done a net of about two percentage points better on average in likely voter polls than in registered voter polls. That is, if the Republican had a four percentage point lead in a poll of registered voters, it might be inferred that he had a six percentage point lead among likely voters instead. This historical trend is probably not simply a statistical fluke, and instead reflects the fact that the demographic groups that tend to vote Republican — for instance, older and wealthier voters — tend to be more likely to vote as well.

However, this advantage can vary a bit from election to election. Sometimes, there is almost no difference between registered and likely voters. In other cases, it can give the Republican candidate an advantage of five percentage points or even more.

The best way to gauge the gap is to look for cases in which the same polling firm has published both registered and likely voter numbers from the same survey. That is, if a Pew poll has Mr. Obama up two among registered voters, but Mr. Romney ahead by three among likely voters, that could be informative.

Be more cautious, however, until — and unless — you start to see these apples-to-apples comparisons. The split between registered voters and likely voters is sometimes smaller than a poll’s house effect. The firm Rasmussen Reports, for instance, shows a very considerable (and Republican-leaning) house effect even when it takes polls of all adults rather than likely voters.
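The apples-to-apples comparison can be written out in a few lines of Python. This is only a sketch: the function names are mine, margins are expressed as Democrat minus Republican in percentage points, and the inputs are the hypothetical figures used above.

```python
def likely_voter_gap(rv_margin, lv_margin):
    """Shift toward the Republican when moving from registered to likely voters.

    Both margins are Democrat minus Republican, from the same survey.
    A positive result means the likely-voter screen helps the Republican.
    """
    return rv_margin - lv_margin


def project_to_likely_voters(rv_margin, gap):
    """Apply an assumed registered-to-likely gap to a registered-voter margin."""
    return rv_margin - gap


# Same survey: Mr. Obama +2 among registered voters, Mr. Romney +3 among likely voters.
same_survey_gap = likely_voter_gap(rv_margin=2.0, lv_margin=-3.0)

# Historical rule of thumb: a 2-point gap turns Republican +4 (RV) into Republican +6 (LV).
projected = project_to_likely_voters(rv_margin=-4.0, gap=2.0)

print(f"gap in this survey: {same_survey_gap:.1f} points toward the Republican")
print(f"projected likely-voter margin: {projected:+.1f}")
```

The projection step is the weaker of the two: it borrows a gap from history, whereas the first function measures the gap directly within a single survey.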

6. Keep paying attention to Mr. Obama’s approval ratings. In the early stages of general election campaigns, a president’s approval ratings have often been at least as accurate a guide to his eventual performance as the head-to-head numbers. Thus, for at least the next couple of months, I would pay as much attention to Mr. Obama’s approval ratings as to his head-to-head polls against Mr. Romney.

It is probably slightly better to look at Mr. Obama’s net approval rating — his approval less his disapproval — than the approval rating alone.

7. Look at a robust array of economic indicators. It is also still very much worth looking at economic data. But do it intelligently. The American economy is a hard thing to measure, and there are not any magic bullets when it comes to predicting the vote.

Common sense indicators like gross domestic product and job growth during the election year have historically explained about 30 percent or 40 percent of election results (but not more than that). Models that claim to do better than that based on economic factors alone are mistaking noisy data for a signal and have a very poor track record at prediction.

It is probably better to look toward the consensus of economic indicators rather than any one data series. If you’re obsessive about this stuff, some financial sites have calendars that publish new economic data points the moment they come off the ticker. Otherwise, focus on major indicators like gross domestic product, jobs, inflation, and measures of wages and income, perhaps along with consumer confidence.

8. Be careful with economic forecasts. Past economic performance should theoretically be incorporated fairly quickly into a president’s approval ratings and his head-to-head polls. But future shifts in economic performance could potentially send the numbers in another direction.

Unfortunately, these shifts are hard to anticipate, and the track record of macroeconomic forecasts is quite bad. Historically, the forecasts issued by economists have had essentially no ability to predict a recession more than six months in advance, and have large margins of error even a month or two out.

If you do want to look at economic forecasts, it is probably better to look at consensus surveys, like those published monthly by The Wall Street Journal, rather than any one individual economist’s forecasts. Historically, the consensus forecast has been about 20 percent more accurate than the typical individual’s forecast.

9. State poll data is useful but very noisy. With Mr. Obama’s running for re-election, and Mr. Romney’s being a fairly orthodox Republican candidate, the swing states this year are very likely to be about the same as the swing states in 2008.

Because of the importance of the Electoral College, of course, it will be worth tracking to see whether there are any shifts. Can Mr. Obama put Arizona in play without Senator John McCain on the ballot? Can Mr. Romney turn New Hampshire from a blue-leaning state into a red-leaning one?

But we’re getting, at best, one poll every two or three weeks in major swing states now, and some important states have hardly been polled at all. Most of this speculation, therefore, is premature.

I do hope that by the time we release our forecasts in a few weeks, the volume of state polling will have increased at least somewhat, but it will probably be at least late summer before the data is robust enough to allow for deeply meaningful conclusions about whether one of the candidates has a systemic advantage in the Electoral College. Keep in mind that a candidate who carries the national popular vote by more than about three percentage points is all but certain mathematically to also win the Electoral College.

10. Don’t abuse demographic cross-tabs. The sample sizes on subpopulations in a poll — like Hispanics, young voters or evangelical Christians — are much smaller than for all voters as a whole and therefore contain much larger margins of error. For instance, a poll that surveys 600 respondents, of whom 75 are Hispanic, has a margin of error of about plus or minus 11 points on that subgroup. And that is under ideal circumstances; in practice, some subgroups (including Hispanics) are harder to get on the phone than others.
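The arithmetic behind that figure is the standard 95 percent margin of error for a simple random sample, taking the worst case of a 50-50 split. A short Python sketch (the function name and defaults are my own choices) shows how quickly the margin widens as the sample shrinks:

```python
import math


def margin_of_error(n, p=0.5, z=1.96):
    """Margin of error, in percentage points, for a simple random sample.

    n: number of respondents
    p: assumed proportion (0.5 gives the widest margin)
    z: z-score for the confidence level (1.96 for roughly 95 percent)
    """
    if n <= 0:
        raise ValueError("n must be positive")
    return z * math.sqrt(p * (1.0 - p) / n) * 100.0


full_sample = margin_of_error(600)  # all respondents
subgroup = margin_of_error(75)      # the 75 Hispanic respondents

print(f"full sample (n=600): +/- {full_sample:.1f} points")
print(f"subgroup (n=75):     +/- {subgroup:.1f} points")
```

This treats the subgroup as an ideal random sample and ignores weighting and differential response rates, both of which tend to make the real margin larger.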

It’s easy to write the “Candidate X has problems among Group Y” stories, but very often they are just weaving narratives from statistical noise. Unless the demographic patterns are clear and consistent across several different polls, these stories are usually worth ignoring.

11. Read the polls in the context of the news. Polls don’t just shift on their own; they change because people are reacting to changes in their circumstances and to different news events.

Political reporters have beats and deadlines and need to turn stories around every day. But most of the day-to-day squabbles that the campaigns have don’t matter to most voters. If there is a shift in the polls, it is much more likely to be real than illusory if it follows something like an Israeli air strike on Iran or a stock market crash than if it follows one of those day-to-day squabbles.

The other caution is that even when major news events do shift the polls, they sometimes have a half-life, with the effects fading over time. These events may produce long-term and permanent effects on how voters see the candidates, but they often overshoot the mark in the near term. Recent examples include the uptrend in Mr. Obama’s approval ratings after the death of Osama bin Laden and the downtrend following the debt ceiling negotiations, both of which persisted for some weeks but then faded.
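One way to picture a half-life is as simple exponential decay toward whatever permanent effect remains. The Python sketch below is purely illustrative: the size of the bump, the 14-day half-life and the permanent component are all invented numbers, not estimates from any real event.

```python
def remaining_bump(initial_bump, days_elapsed, half_life_days, permanent=0.0):
    """Size of a polling bump after `days_elapsed`, in percentage points.

    The transient part (initial_bump - permanent) halves every `half_life_days`;
    the `permanent` part, if any, never fades.
    """
    if half_life_days <= 0:
        raise ValueError("half_life_days must be positive")
    transient = initial_bump - permanent
    return permanent + transient * 0.5 ** (days_elapsed / half_life_days)


# Hypothetical 6-point bump with a 14-day half-life and a 1-point lasting effect.
for day in (0, 14, 28, 56):
    bump = remaining_bump(6.0, day, half_life_days=14, permanent=1.0)
    print(f"day {day:2d}: {bump:+.2f} points")
```

Under these assumptions most of the movement is gone within a couple of months, which is the pattern to keep in mind before reading a post-event poll as the new normal.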

12. Don’t over-learn the lessons of history. A final and more general point is that there have been only 16 presidential elections since World War II. That simply isn’t a lot of data, and overly specific conclusions from them, like “no recent president has been re-elected with an unemployment rate over 8.0 percent” or “no recent incumbent has lost when he did not face a primary challenge,” are often not very meaningful in practice and will generally not carry much predictive weight.
