Which Pollsters To Trust In 2018

As FiveThirtyEight has evolved over the past 10 years, we’ve taken an increasingly “macro” view of polling. By that, I mean: We’re more interested in how the polls are doing overall — and in broad trends within the polling industry — and less in how individual polls or pollsters are performing. As we described in an article earlier this week, overall the polls are doing … all right. Contrary to the narrative about the polls, polling accuracy has been fairly constant over the past couple of decades in the U.S. and other democratic countries.

Still, in election coverage, the “micro” matters too, and our newly updated pollster ratings — in which we evaluate the performance of individual polling firms based on their methodology and past accuracy — are still a foundational part of FiveThirtyEight. They figure into the algorithms that we design to measure President Trump’s approval ratings and to forecast elections (higher-rated pollsters get more weight in the projections). And sometimes those pollster ratings can reveal broad trends too: For example, after a reasonably strong 2012, online polls were fairly inaccurate in 2016.

The ratings also allow us to measure pollster performance over a large sample of elections — rather than placing a disproportionate amount of emphasis on one or two high-profile races. For instance, Rasmussen Reports deserves a lot of credit for its final, national poll of the 2016 presidential election, which had Hillary Clinton ahead by 2 percentage points, almost her exact margin of victory in the popular vote. But Rasmussen Reports polls are conducted by a Rasmussen spinoff called Pulse Opinion Research LLC, and state polls conducted by Rasmussen and Pulse Opinion Research over the past year or two have generally been mediocre.

So which pollsters have been most accurate in recent elections? Because some races are easier to poll than others, we created a statistic called Advanced Plus-Minus to evaluate pollster performance. It compares a poll’s accuracy to other polls of the same races and the same types of election. Advanced Plus-Minus also adjusts for a poll’s sample size and when the poll was conducted. (For a complete description, see here; we haven’t made any changes to our methodology this year.) Negative plus-minus scores are good and indicate that the pollster has had less error than other pollsters in similar types of races.

The table below contains Advanced Plus-Minus scores for the most prolific pollsters — those for whom we have at least 10 polls in our database for elections from Nov. 8, 2016 onward. These polls cover the 2016 general election along with any polling in special elections or gubernatorial elections since 2016.

How prolific pollsters have fared in recent elections

Advanced Plus-Minus scores for pollsters’ surveys conducted for elections on Nov. 8, 2016, and later

pollster	Methodology	No. of Polls	Avg. Error	Advanced Plus-Minus	Bias
Monmouth University	Live	24	4.8	-1.5	D+3.9
Emerson College	IVR	51	4.1	-1.0	D+1.2
Siena College	Live	18	4.0	-0.9	D+1.5
Landmark Communications	IVR/online	14	4.4	-0.6	D+4.3
Marist College	Live	17	3.7	-0.6	D+1.5
Lucid	Online	14	2.6	-0.4	D+2.4
SurveyUSA	IVR/online/live	18	4.5	-0.2	D+1.0
Trafalgar Group	IVR/online/live	15	4.0	-0.1	R+0.8
YouGov	Online	33	4.3	+0.0	D+2.8
Opinion Savvy	IVR/online	11	4.3	+0.1	D+2.8
Quinnipiac University	Live	26	4.4	+0.1	D+4.2
Rasmussen Reports/Pulse Opinion Research	IVR/online	55	5.1	+0.4	D+3.6
CNN/Opinion Research Corp.	Live	10	4.3	+0.6	D+1.4
Gravis Marketing	IVR/online	53	4.6	+0.7	D+2.5
Remington Research Group	IVR/live	32	4.9	+0.8	D+2.1
Public Policy Polling	IVR/online	28	5.2	+1.0	D+5.2
SurveyMonkey	Online	195	7.3	+2.3	D+5.6
University of New Hampshire	Live	19	8.9	+3.4	D+8.9
Google Surveys	Online	12	8.4	+5.0	D+1.8

The best of these pollsters over this period has been Monmouth University, which has an Advanced Plus-Minus score of -1.5. That’s not a huge surprise — Monmouth was already one of our highest-rated pollsters. After that, the list is somewhat eclectic, including traditional, live-caller pollsters such as Siena College and Marist College, as well as automated pollsters such as Emerson College and Landmark Communications. Polling institutes run by colleges and universities are somewhat overrepresented among the high performers on the list and have generally become a crucial source of polling as other high-quality pollsters have fallen by the wayside.

The lowest-performing pollsters in this group are the University of New Hampshire’s Survey Center, Google Surveys and SurveyMonkey. UNH uses traditional telephone interviewing, but its polls were simply way off the mark in 2016, overestimating Democrats’ performance by an average of almost 9 percentage points in the polls it conducted of New Hampshire and Maine.

Google Surveys and SurveyMonkey are newer and more experimental online-based pollsters. Google Surveys has an unusual methodology in which it shows people a poll in lieu of an advertisement and then infers respondents’ demographics based on their web browsing habits. While national polls that used the Google Surveys platform got fairly good results both in 2012 and 2016, state polls that used this technology have generally been highly inaccurate. Some Google Surveys polls also have a highly do-it-yourself feel to them, in that members of the public can use the Google Surveys platform to create and run their own surveys. We at FiveThirtyEight are going to have to do some thinking about whether to include these types of do-it-yourself polls in our averages and forecasts.

SurveyMonkey, which sometimes partners with FiveThirtyEight on non-election-related polling projects, conducted polling in all 50 states in 2016, asking about both the presidential election and races for governor and the U.S. Senate. Unlike some other attempts to poll all 50 states,¹ SurveyMonkey took steps to ensure that each state was weighed individually and that respondents to the poll were located within the correct state. Thus, FiveThirtyEight treated these polls as we did any other state poll. Unfortunately, the results just weren’t good, with an average error² of 7.3 percentage points and an Advanced Plus-Minus score of +2.3.

It wasn’t just Google Consumer Surveys or SurveyMonkey, however — overall, online polls (with some exceptions such as YouGov and Lucid) have been fairly unreliable in recent elections. So have the increasing number of polls that use hybrid or mixed methodologies, such as those that mostly poll using automated calls (also sometimes called IVR or interactive voice response) but supplement these results using an online panel.

In the chart below, I’ve calculated Advanced Plus-Minus scores and other statistics based on the technologies the polls used. An increasing number of polling firms no longer fall cleanly into one category and instead routinely use more than one mode of data collection within the same survey or switch back and forth from one methodology to the next from poll to poll. Therefore, I’ve distinguished polls that use one methodology exclusively from those that employ mixed methods.

Online polls have been less accurate in recent elections

Advanced Plus-Minus scores for pollsters’ surveys conducted for elections on Nov. 8, 2016, and later

Poll type	No. of Polls	Average Error	Adv. Plus-Minus	Bias
Live caller	77	4.9	+0.1	D+2.2
Live caller only	62	4.8	-0.1	D+2.5
Live caller hybrid	15	5.2	+0.7	D+1.2
IVR	35	4.6	-0.0	D+2.0
IVR only	13	4.5	-0.7	D+0.8
IVR hybrid	17	4.7	+0.4	D+2.6
Online	32	5.3	+1.1	D+3.0
Online only	15	5.4	+1.6	D+3.3
Online hybrid	17	5.1	+0.7	D+2.8
All pollsters	119	4.9	+0.3	D+2.4

The clearest trends are that telephone polls — including both live caller and IVR polls — have outperformed online polls in recent elections and that polls using mixed or hybrid methods haven’t performed that well.

The relatively strong performance of IVR polls is surprising, considering that automated polls are not supposed to call cellphones and that more than half of U.S. households are now cellphone-only. It ought to be difficult to conduct a representative survey given that constraint.

We’ve sometimes seen the claim that IVR polls are more accurate because people are more honest about expressing support for “politically incorrect” candidates such as Trump when there isn’t another human being on the other end of the phone. This feeling of greater anonymity would presumably also apply to online polls, however, and online polls have not been very accurate lately (and they tended to underestimate Trump in 2016).

Another answer may be that the IVR polls were more lucky than good in 2016. In general, online polls tend to show more Democratic-leaning results, IVR polls tend to show more Republican-leaning results, and live-caller polls are somewhere in between. Thus, in years such as 2012 when Democratic candidates beat the polling averages, online polls tend to look good, and in years when Republicans outperform their polls, IVR polls look good. If undecided voters largely broke to Trump in 2016, polls that initially had too many Republicans in their samples would wind up performing well.

Over the long run, the highest-performing pollsters have been those that:

Exclusively use live-caller interviews, including calls placed to cellphones, and
Participate in professional initiatives that encourage transparency and disclosure.³

FiveThirtyEight’s pollster ratings will continue to award a modest bonus to pollsters that meet one or both of these standards and apply a modest penalty to those that don’t. Thus, the letter grades you see associated with polling firms are based on a combination of their historical accuracy and their methodological standards. Polling firms with non-standard methodologies can sometimes have individual races or even entire election cycles in which they perform quite well. But they don’t always sustain their performance over the long run.

As for online polls, we don’t want to discourage experimentation or to draw too many conclusions from just one cycle’s worth of polling. But we at FiveThirtyEight are becoming skeptical of what you might call bulk or “big data” approaches to polling using online platforms. The polling firms that get the best results tend to be those that poll no more than about six to eight states and put a lot of thought and effort into every poll. Online firms may want to do less national polling and fewer 50-state experiments and concentrate more on polling in electorally important states and congressional districts. Results in these contests will go a long way toward determining whether online polling is an adequate substitute for telephone polling.

Footnotes

Other pollsters published results from all 50 states, but they were equivalent to demographic cross-tabs rather than individually weighted polls of each state. The pollsters that did this include Ipsos and Google Consumer Surveys. We aren’t including their state “polls” in the pollster ratings database, but if we had included them, Ipsos’s state-by-state “polls” would have received about an average rating, while the Google Consumer Surveys state “polls” were highly inaccurate and would have rated extremely poorly.
The average error is simply the difference between the vote share margin in the poll and the actual results. For instance, if the poll had the Democrat ahead by 1 point and the Republican won by 3 points, it would be a 4-point error.
Specifically, which participate in the American Association for Public Opinion Research’s transparency initaitive, are members of the National Council on Public Polls or contribute data to the Roper Center for Public Opinion Research’s data archive.

Related: Politics Podcast

Politics Podcast: A Conversation About Our Pollster Ratings

Footnotes

Comments