Biden Up 15. Warren Up 7. Are Primary Polls Too Far Apart?

Last week, two polls painted two very different pictures of the state of the primary race. A CNN/SSRS poll put former Vice President Joe Biden 15 points ahead of Sen. Elizabeth Warren, 34 percent to 19 percent, while a Quinnipiac University poll released a day later found Biden trailing Warren by 7 points, 21 to 28 percent.

As both The New York Times and The Washington Post pointed out, these are pretty large discrepancies for two polls in the field at roughly the same time (the CNN poll was in the field from Oct. 17-20, the Quinnipiac poll from Oct. 17-21). But trying to figure out which poll is more accurate is kind of beside the point. After all, these are only two polls of a primary that has been polled hundreds of times, and it isn’t necessarily a problem that two pollsters arrived at different conclusions.

In fact, we expect some differences between polls in pretty much any race, even if the differences are just caused by random sampling variability (since no two random samples are exactly the same). And we almost always see some outlier polls, as long as pollsters aren’t herding. But given that the spread between these polls was so large, it naturally raises the question of whether we should expect the polls to differ this much. Are these polls just normal outliers, or are they a sign that the polls overall are too spread out?

Short answer: The spread we’re seeing is well outside the bounds of what you’d expect based on sampling error alone. To arrive at this conclusion, I took all the national polls since the beginning of October¹ and ran 10,000 simulations estimating how wide the spread of the polls “should” be for Biden and Warren based on the sample size of each poll.² For each simulation, I calculated the standard deviation (a measure of the spread of the polls), resulting in a distribution of the standard deviations we’d expect to see as a result of sampling error alone. If sampling error were the only thing at work, the actual standard deviation of the polls would fall within the middle 95 percent of that distribution 95 percent of the time, and within the middle 99 percent of it 99 percent of the time. And as you can see in the chart below, the spread between all the October polls is way outside that range for both Biden and Warren.

For example, with Biden, we’d expect the standard deviation for polls to be about 2 percentage points, but it’s actually 3.5 points. It’s a similar situation for Warren — we’d expect the standard deviation to be between 2 and 3 points, but in fact it’s almost 5 points.
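To make the simulation idea concrete, here is a minimal sketch in Python. It is not the code behind the analysis above. It assumes a single fixed level of support (the analysis used a loess curve to estimate a moving average), it uses a normal approximation to simple random sampling error, and the sample sizes and the 27 percent share are made up for illustration.

```python
import random
import statistics

def simulate_spreads(share, sample_sizes, n_sims=10_000, seed=0):
    """Return the standard deviation (in points) of each simulated set of polls."""
    rng = random.Random(seed)
    # Standard error of each poll, in percentage points, assuming simple
    # random sampling (no weighting or design effects).
    errors = [100 * (share * (1 - share) / n) ** 0.5 for n in sample_sizes]
    spreads = []
    for _ in range(n_sims):
        # One simulated result per poll: true support plus sampling noise.
        polls = [rng.gauss(100 * share, se) for se in errors]
        spreads.append(statistics.stdev(polls))
    return spreads

# Hypothetical inputs: six polls and a candidate whose true support is 27 percent.
sample_sizes = [424, 713, 1000, 468, 1200, 850]
spreads = sorted(simulate_spreads(0.27, sample_sizes))

# Middle 95 percent of the simulated standard deviations.
low = spreads[int(0.025 * len(spreads))]
high = spreads[int(0.975 * len(spreads)) - 1]
print(f"95% of simulated standard deviations fall between {low:.1f} and {high:.1f} points")
```

If the standard deviation of the real polls lands above the upper end of that interval, sampling error alone is an unlikely explanation for the spread.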

That suggests that it isn’t just sampling error that’s driving the differences we’re seeing — it implies there are some real methodological differences between the polls. Pollsters regularly use different approaches to polling, sampling and weighting, which can often lead to different outcomes. This is actually a good thing, since there’s a lot of uncertainty about the electorate in 2020 and it’s important that different pollsters make independent decisions about how to analyze it. This is why it’s important to control for variations in pollsters’ techniques when analyzing individual polls. Each pollster’s preferred methodology tends to make its results lean a little toward one party or to certain candidates — these leans are commonly known as “house effects,” and they can help explain some of the variation we’re seeing.

If we account for house effects,³ it turns out the spread in national polls looks a lot more like what we’d expect it to. (Though it’s still on the big side, especially for Biden.)

The standard deviation of Biden’s polls is still on the high end, but after accounting for house effects, the spread is no longer that much larger than we’d expect — it’s at 2.4 points instead of 3.5 points. It’s similar for Warren, where the standard deviation is now at 2.1 points instead of nearly 5.
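A rough sketch of that kind of house-effect adjustment, in Python, might look like the following. It is an illustration under stated assumptions, not the analysis itself: a simple kernel-weighted average stands in for the loess curve, the 14-day bandwidth is arbitrary, the pollsters "A," "B" and "C" and all of their numbers are invented, and the house effects are estimated from and applied to the same set of polls rather than estimated on polls since May and applied to October polls.

```python
import math
from collections import defaultdict
from statistics import mean, stdev

def trend_at(day, polls, bandwidth=14.0):
    """Kernel-weighted average of poll values near `day` (a stand-in for loess)."""
    weights = [math.exp(-0.5 * ((p["day"] - day) / bandwidth) ** 2) for p in polls]
    return sum(w * p["value"] for w, p in zip(weights, polls)) / sum(weights)

def house_effects(polls):
    """Mean deviation of each pollster's results from the trend line."""
    residuals = defaultdict(list)
    for p in polls:
        residuals[p["pollster"]].append(p["value"] - trend_at(p["day"], polls))
    return {name: mean(vals) for name, vals in residuals.items()}

def adjust(polls, effects):
    """Subtract each pollster's house effect from its results."""
    return [{**p, "value": p["value"] - effects.get(p["pollster"], 0.0)} for p in polls]

# Invented data: pollster A runs high for the candidate, B runs low, C sits in between.
polls = [
    {"day": 0, "pollster": "A", "value": 31.0},
    {"day": 2, "pollster": "B", "value": 25.0},
    {"day": 4, "pollster": "C", "value": 28.0},
    {"day": 10, "pollster": "A", "value": 30.0},
    {"day": 12, "pollster": "B", "value": 24.0},
    {"day": 14, "pollster": "C", "value": 27.0},
    {"day": 20, "pollster": "A", "value": 32.0},
    {"day": 22, "pollster": "B", "value": 26.0},
    {"day": 24, "pollster": "C", "value": 29.0},
]

effects = house_effects(polls)
adjusted = adjust(polls, effects)
raw_sd = stdev([p["value"] for p in polls])
adj_sd = stdev([p["value"] for p in adjusted])
print(f"Spread before adjustment: {raw_sd:.1f} points; after: {adj_sd:.1f} points")
```

In this toy data, most of the spread comes from pollsters leaning consistently in one direction, so removing each pollster's average lean shrinks the standard deviation.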

So what does this tell us about those CNN and Quinnipiac polls? In short, the fact that they found such different outcomes isn’t that big a deal. As you can see in the chart below, once we control for house effects, the overall spread between polls since May isn’t actually all that large. In fact, the spread of values for both Biden and Warren falls within a range we might expect. So don’t read too much into those two polls. It turns out they’re just the kind of outliers we’d expect to see in this range of polls.

Footnotes

¹ I excluded tracking polls to avoid double-counting respondents.
² Keep in mind, though, that any differences could also be a result of Biden and Warren’s standing in the polls changing, i.e., losing or gaining actual support. To account for this, I used a loess curve to estimate their averages.
³ I calculated house effects for each candidate by taking all national polls since May, fitting a loess curve, taking the mean deviation of each pollster’s values from that loess curve, and then applying that mean deviation to national polls from October. It’s a pretty rough adjustment, but it should give us some idea of the role that house effects play.

FiveThirtyEight
