There was plenty of apprehension at the annual conference of the American Association for Public Opinion Research, which I attended last month,1 about the state of the polling industry.
The problem is simple but daunting. The foundation of opinion research has historically been the ability to draw a random sample of the population. That’s become much harder to do, at least in the United States. Response rates to telephone surveys have been declining for years and are often in the single digits, even for the highest-quality polls. The relatively few people who respond to polls may not be representative of the majority who don’t. Last week, the Federal Communications Commission proposed new guidelines that could make telephone polling even harder by enabling phone companies to block calls placed by automated dialers, a tool used in almost all surveys.2
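The statistical appeal of a true random sample can be made concrete: when every member of the population has an equal chance of being interviewed, the margin of error follows directly from the sample size, which is why a poll of about 1,000 respondents is typically reported with a margin of roughly ±3 points. Here's a minimal sketch of that textbook calculation (illustrative sample sizes, not from any particular poll):

```python
import math

def margin_of_error(n, p=0.5, z=1.96):
    """95% margin of error for a proportion from a simple random sample.

    Assumes every member of the population had an equal chance of being
    sampled -- exactly the assumption that collapsing response rates and
    call-blocking put at risk.
    """
    return z * math.sqrt(p * (1 - p) / n)

for n in (400, 1000, 2000):
    print(f"n={n}: +/- {margin_of_error(n):.1%}")
```

Note what the formula does not contain: a term for who refused to answer. A single-digit response rate leaves the arithmetic untouched but can quietly break the equal-probability assumption the whole calculation rests on.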
What about Internet-based surveys? They’ll almost certainly be a big part of polling’s future. But there’s not a lot of agreement on the best practices for online surveys. It’s fundamentally challenging to “ping” a random voter on the Internet in the same way that you might by giving her an unsolicited call on her phone.3 Many pollsters that do Internet surveys eschew the concept of the random sample, instead recruiting panels that they claim are representative of the population.
If you’ve been reading FiveThirtyEight, you know that we’ve been writing about these challenges for a long time (as have many of our friends elsewhere). Pollsters have long been worried about these issues too, of course. But until recently, the problems seemed to have relatively little impact on the accuracy of polling. In fact, the polls were quite accurate across a series of American election cycles — especially 2004, 2008 and 2010. And while they weren’t great in 2012, underestimating President Obama’s performance and that of Democrats generally, they still pointed mostly in the right direction.
But lately, there have been a series of relatively poor outcomes. Polls of the U.S. midterms last year badly underestimated the Republican vote. And there have been mishaps in other Western democracies. Last month, polls of the U.K. election — most of them conducted online — projected a photo-finish for Parliament instead of a Conservative majority.4 The polls also had fairly poor results in last year’s Scottish independence referendum and this year’s Israeli general election.
So if the polls fared poorly, does that mean you should have listened to the pundits after all? Not really: In these elections, the speculation among media insiders was usually no better than the polls and was often worse. Almost no one, save perhaps Mick Jagger, assigned much of a chance to the Conservatives’ big win in the U.K. last month, with some betting shops offering odds of 25-to-1 against a Conservative majority. In the last two U.S. elections, meanwhile, the polling error ran in the opposite direction of what the conventional wisdom anticipated. In 2012, there was a lot of media discourse about how polls might be “skewed” against Republicans. As it happened, the polls were skewed that year but toward Republicans, with Democrats beating the predicted outcome in almost every state. Then in 2014, exactly the opposite occurred. The media discourse was mostly about whether the polls would underestimate Democrats again, but instead they were biased toward Democrats.5
This may not be a coincidence. The views of pollsters, polling aggregators and pundits may feed back upon one another, even or perhaps especially when they’re incorrect. (When to expect a surprise? When no one expects one.) In fact, there’s increasing evidence of a pollster phenomenon known as “herding.” Toward the end of a campaign, pollsters’ results often fall artificially in line with one another as the conventional wisdom forms about the race. In some cases, pollsters have admitted to suppressing polls they deem to be outliers but that would have turned out to be just right. The U.K. pollster Survation, for instance, declined to release a poll showing Conservatives ahead of Labour by 6 points — about the actual margin of victory — because the results seemed “so ‘out of line’ with all the polling,” the company later disclosed. And in the U.S. last year, at least two polling firms declined to publish surveys showing a tight Senate race in Virginia, which in the end was decided by only 18,000 votes in what was almost a historic upset.
There was a lot of discussion about herding at AAPOR. It’s something that probably always has gone on, to some extent. On the eve of an election, if pollsters have one turnout model that shows a result right in line with the FiveThirtyEight or Real Clear Politics average and another showing a “surprising” result, they may not be eager to risk their reputation by publishing the “outlier.”6 But there’s more potential for herding as the fundamentals of polling deteriorate and as polling becomes more technique- and assumption-driven.
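Herding is detectable, in principle, because independent random samples should disagree by a predictable amount: under sampling error alone, the spread of a race's final polls has a known expected size, and a spread much tighter than that suggests something other than chance is pulling the numbers together. A rough sketch of that check, using hypothetical final-week margins (not real polls):

```python
import math
import statistics

def expected_sd_of_margin(n, p=0.5):
    """Standard deviation (in points) of a two-party margin (D% - R%)
    expected from sampling error alone, for one poll of n respondents."""
    # The SD of a single candidate's share is sqrt(p(1-p)/n); the margin
    # moves twice as fast as one share, hence the factor of 2.
    return 2 * math.sqrt(p * (1 - p) / n) * 100

# Hypothetical final-week poll margins, each from roughly 800 interviews.
margins = [2.0, 2.0, 3.0, 2.0, 3.0, 2.0]

observed = statistics.stdev(margins)
expected = expected_sd_of_margin(n=800)

print(f"observed spread: {observed:.1f} pts, expected: {expected:.1f} pts")
# A ratio well below 1 is consistent with herding.
print(f"ratio: {observed / expected:.2f}")
```

The paradox is that polls agreeing too closely is a warning sign, not a mark of quality: honest random samples of 800 people should scatter by a few points even when every pollster is doing everything right.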
The FCC’s proposed restrictions on automated calls could make this worse. At FiveThirtyEight, we’re not huge fans of “robopolls,” which are polls conducted by a pre-recorded script rather than a live interviewer. With some exceptions like SurveyUSA, the pollsters who use this tool generally receive poor pollster ratings and are often among the worst herders. But lots of traditional pollsters also use automated dialers to randomly select and dial a number before a human jumps on to conduct the interview. The regulations could make their surveys more expensive as well.
The concern isn’t solely with pre-election “horse race” polls, however. Although they receive a lot of attention, they represent a small fraction of what the public opinion industry does. At AAPOR, there were also representatives of groups ranging from government agencies like the Census Bureau and the Centers for Disease Control and Prevention to commercial measurement groups like the Nielsen Co. All of them rely on random-sample surveys of one kind or another. So do economists; many essential statistics, from the monthly jobs report to consumer confidence figures, are drawn from surveys.
Polls are also essential to understanding public opinion on a host of issues that people never get a chance to vote upon. How do Americans feel about higher taxes on the rich? The Keystone XL pipeline? Abortion? Capital punishment? Obamacare?
Left to their own devices, politicians are not particularly good at estimating prevailing public opinion. Neither, for the most part, are journalists. One reason that news organizations like The New York Times and (FiveThirtyEight partner) ABC News continue to conduct polls — at great expense and at a time when their newsrooms are under budgetary pressure — is as a corrective to inaccurate or anecdotal representations of public opinion made by reporters based mostly in New York and Washington. Polling isn’t a contrast to “traditional” reporting. When done properly, it’s among the most rigorous types of reporting, consisting of hundreds or thousands of interviews with statistically representative members of a particular community.
So then … what should we do if polling is getting harder?
For data-driven journalists like us at FiveThirtyEight, some of the answers are obvious. There’s a lot of reporting and research to be done on under what circumstances polls perform relatively better and worse. Other answers are more esoteric. For instance, if pollsters move away from purely random samples toward other techniques, the error distributions may change too.7 At the most basic level, it’s important for news organizations like FiveThirtyEight to continue forecasting elections. It can be easy to sit on the sidelines and criticize after the fact — we’ve done it ourselves at times — but given the pitfalls of hindsight bias that does little to advance knowledge or accountability.8
Likewise, it’s essential for polling firms to continue publishing pre-election surveys. While horse-race polls represent a small fraction of all surveys, they provide for relatively rare “natural experiments” by allowing survey research techniques to be tested against objective real-world outcomes.9
And the FCC probably ought to go back to policing “wardrobe malfunctions” instead of making pollsters’ jobs harder. Without accurate polling, government may end up losing its most powerful tool to know what the people who elect it really think.