The State Of The Polls, 2019

Much maligned for their performance in the 2016 general election — and somewhat unfairly so, since the overall accuracy of the polls was only slightly below average that year by historical standards — American election polls have been quite accurate since then. Their performance was very strong in the 2018 midterms, despite the challenge of having to poll dozens of diverse congressional districts around the country, many of which had not had a competitive election in years. Polls have also generally been accurate in the various special elections and off-year gubernatorial elections that have occurred since 2016, even though those are also often difficult races to poll.

Does that mean everything is looking up in the industry? Well, no. We’ll introduce some complications in a moment. But I do want to re-emphasize that opening takeaway, since the media is just flatly wrong when it asserts that the polls can’t be trusted. In fact, American election polls are about as accurate as they’ve always been. That doesn’t mean polls will always identify the right winner, especially in close elections. (As a simple rule of thumb, we’ve found polls “call” the right winner about 80 percent of the time, meaning they fail to do so the other 20 percent of the time, although upsets are more likely to occur in some circumstances than others.) But the rate of upsets hasn’t changed much over time.

Before we go any further, I want to direct you to the latest version of FiveThirtyEight’s pollster ratings, which we’ve updated for the first time since May 2018. They include all polls in the three weeks leading up to every U.S. House, U.S. Senate and gubernatorial general election since then,¹ including special elections, plus a handful of polls from past years that were missing from previous versions of our database. You can find much more detail on the pollster ratings here, including all the polls used in the ratings calculation. Our presidential approval ratings, generic congressional ballot and impeachment trackers have also been updated to reflect these new ratings, although they make little difference to the topline numbers.

Now then, for those complications: The main one is simply that response rates to traditional telephone polls continue to decline. In large part because of caller-ID and call-blocking technologies, it’s harder than it used to be to get people to answer phone calls from people they don’t know. In addition to potentially making polls less accurate, that also makes them more expensive, since a pollster has to spend more time making calls for every completed response that it gets. As a result, the overall number of polls has begun to decline slightly. Our pollster ratings database, which covers polls conducted in the 21 days before an election, contains 532 polls associated with elections on Nov. 6, 2018, down from 558 polls for Election Day 2014 and 692 polls for Election Day 2010.²


So why not turn to online polls or other new technologies? Well, the problem is that in recent elections, polls that use live interviewers to call both landlines and cellphones continue to outperform other methods, such as online and automated (IVR) polls. Moreover, online and IVR polls are generally more prone toward herding — that is, making methodological choices, or picking and choosing which results they publish, in ways that make their polls match other, more traditional polls. So not only are online and automated polls somewhat less accurate than live-caller polls, but they’d probably suffer a further decline in accuracy if they didn’t have live polls to herd toward.

Still, online polling is undoubtedly a large part of polling’s future — and some online polling firms are more accurate than others. Among the most prolific online pollsters, for example, YouGov stands out for being more accurate than others such as Zogby, SurveyMonkey, and Harris Insights & Analytics. And many former IVR pollsters are now migrating to hybrid methods that combine automated phone polling with internet panels. In the 2018 elections, this produced better results in some cases (e.g., SurveyUSA) than in others (e.g., Rasmussen Reports).

Polls have been quite accurate — and unbiased — in post-2016 elections

Each time we update our pollster ratings, we publish a few charts that depict the overall health of the industry — so let’s go ahead and run the numbers again. The first chart is the one we consider to be the most important: the average error of polls broken down by the type of election. A few quick methodological notes:

By average error, I mean the difference between the margin projected by the poll and the actual election result. For instance, if the poll shows the Democrat up by 1 percentage point and the Republican wins by 2 points, that would be a 3-point error.
To not give any one polling firm too much influence, the values in the chart are weighted based on the number of polls a particular pollster conducted for that particular type of election in that particular cycle.³
Polls that are banned by FiveThirtyEight because we know or suspect that they faked data are excluded from the analysis.
Note that I’ve included the handful of elections that have occurred so far in 2019 with the 2017-18 election cycle, even though we’ll classify them later as part of the 2019-20 cycle instead.
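The error definition and the pollster weighting in the notes above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not FiveThirtyEight's actual code: the function names and the sample record are hypothetical, margins are expressed as Democrat minus Republican, and the sketch assumes each pollster's mean error is weighted by the square root of its poll count, which is one reading of the weighting note.

```python
from collections import defaultdict
from math import sqrt

def poll_error(poll_margin, actual_margin):
    """Absolute miss on the margin, in points (margins are Democrat minus Republican)."""
    return abs(poll_margin - actual_margin)

def weighted_average_error(polls):
    """polls: iterable of (pollster, poll_margin, actual_margin) tuples."""
    errors = defaultdict(list)
    for pollster, poll_margin, actual_margin in polls:
        errors[pollster].append(poll_error(poll_margin, actual_margin))
    # Assumption: a pollster's mean error gets weight sqrt(number of its polls),
    # so a firm with 9 polls counts 3 times as much as a firm with 1 poll.
    total = sum(sqrt(len(e)) * (sum(e) / len(e)) for e in errors.values())
    weight = sum(sqrt(len(e)) for e in errors.values())
    return total / weight

# Hypothetical record: Democrat +1 in the poll, Republican wins by 2.
example = [("Pollster A", 1.0, -2.0)]
print(weighted_average_error(example))  # 3.0
```

Weighting by the square root rather than the raw count keeps a prolific firm from dominating the average while still giving it more say than a firm with a single poll.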

OK, here’s the data:

Post-2016 polls have been accurate by historical standards

Weighted-average error of polls in final 21 days before the election, among polls in FiveThirtyEight’s Pollster Ratings database

Cycle	Governor	U.S. Senate	U.S. House	Pres. General	Pres. Primary	Combined
1998	8.2	7.4	6.8			7.6
1999-2000	4.9	6.1	4.4	4.4	7.6	5.5
2001-02	5.2	4.9	5.4			5.2
2003-04	6.0	5.6	5.4	3.2	7.1	4.8
2005-06	5.0	4.2	6.5			5.3
2007-08	4.1	4.7	5.7	3.6	7.4	5.4
2009-10	4.9	4.8	6.9			5.7
2011-12	4.9	4.7	5.1	3.6	8.9	5.2
2013-14	4.6	5.5	6.5			5.4
2015-16	5.4	5.0	5.5	4.8	10.1	6.7
2017-19	5.3	4.3	5.0			5.0
All years	5.4	5.3	6.1	4.0	8.7	5.8

As I said, the 2017-19 cycle was one of the most accurate on record for polling. The average error of 5.0 points in polls of U.S. House elections is the second-best in our database, trailing only 1999-2000. The 4.3-point error associated with U.S. Senate elections is also the second-best, slightly trailing 2005-06. And gubernatorial polls had an average error of 5.3 points, which is about average by historical standards.

Combining all different types of elections together, we find that polls from 2017 onward have been associated with an average error of 5.0 points, which is considerably better than the 6.7-point average for 2015-16, and the best in any election cycle since 2003-04.

But note that there’s just not much of an overall trajectory — upward or downward — in polling accuracy. Relatively strong cycles for the polls can be followed by relatively weak ones, and vice versa.

One more key reminder now that the Iowa caucuses are only three months away: Some types of elections are associated with considerably larger polling errors than others. In particular, presidential primaries feature polling that is often volatile at best, and downright inaccurate at worst. Overall, presidential primary polls in our database mispredict the final margin between the top two candidates by an average of 8.7 points. And the error was even worse, 10.1 points, in the 2016 primary cycle. Leads of 10 points, 15 points or sometimes more are not necessarily safe in the primaries.

We can also look at polling accuracy by simply counting up how often the candidate leading in the poll wins his or her race.⁴ This isn’t our preferred method, as it’s a bit simplistic — if a poll had the Republican ahead by 1 point and the Democrat won by 1 point, that’s a much more accurate result than if the Republican had won by 20, even though it would have incorrectly identified the winner. But across all polls in our database, the winner was “called” 79 percent of the time.
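To make the contrast between the two scoring methods concrete, here is a minimal Python sketch. The function name `call_credit` is hypothetical, margins are Democrat minus Republican, and the sketch assumes a two-candidate race with a nonzero final margin; the half-credit rule for ties follows the footnote.

```python
def call_credit(poll_margin, actual_margin):
    """1.0 if the poll's leader won, 0.5 if the poll showed a tie, else 0.0."""
    if poll_margin == 0:
        return 0.5  # a tie for the lead earns half-credit
    same_side = (poll_margin > 0) == (actual_margin > 0)
    return 1.0 if same_side else 0.0

# The poll has the Republican ahead by 1 point in both scenarios.
scenarios = [(-1.0, 1.0),    # Democrat wins by 1
             (-1.0, -20.0)]  # Republican wins by 20
for poll, actual in scenarios:
    print(abs(poll - actual), call_credit(poll, actual))
# prints:
# 2.0 0.0
# 19.0 1.0
```

The first scenario misses the margin by only 2 points but gets no credit for the call, while the second misses by 19 points and gets full credit, which is why the win-loss count is the cruder of the two measures.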

Polls “call” the winner right 79 percent of the time

Weighted-average share of polls that correctly identified the winner in final 21 days before the election, among polls in FiveThirtyEight’s Pollster Ratings database

Cycle	Governor	U.S. Senate	U.S. House	Pres. General	Pres. Primary	Combined
1998	86%	86%	57%			78%
1999-2000	80	80	56	68%	95%	76
2001-2002	87	87	77			82
2003-2004	76	76	69	78	94	79
2005-2006	89	89	71			83
2007-2008	95	95	83	94	80	88
2009-2010	85	85	75			82
2011-2012	90	90	70	81	63	77
2013-2014	80	80	76			77
2015-2016	68	68	57	71	86	77
2017-2019	77	77	78			76
All years	82	82	72	79	83	79

In recent elections, the winning percentage has been slightly below the long-term average; it was 76 percent in 2017-19. But this reflects two things: the recent uptick in close elections, and the tendency of resource-constrained pollsters to poll these close elections more heavily.⁵

As basic as this analysis is, it’s essential to remember that polls are much more likely to misidentify the winner when they show a close race. Polls in our database that showed a lead of 3 percentage points or less identified the winner only 58 percent of the time, a bit better than random chance, but not much better. But polls showing a 3- to 6-point lead were right 72 percent of the time, and those with a 6- to 10-point lead were right 86 percent of the time. (Errors in races showing double-digit leads are quite rare in general elections, although they occur with some frequency in primaries. And errors in races where one candidate leads by 20 or more points are once-in-a-blue-moon types of events, regardless of the type of election.)

Polls often misidentify the winner in a close race

Share of polls that correctly identified the winner in final 21 days before the election, among polls in FiveThirtyEight’s Pollster Ratings database

Leading candidate’s margin	Share of polls correctly identifying winner
0-3 points	58%
3-6 points	72
6-10 points	86
10-15 points	94
15-20 points	98
≥20 points	>99

Another essential measure of polling accuracy is statistical bias, that is, whether the polls tend to miss in the same direction. We’re particularly interested in understanding whether polls systematically favor Democrats or Republicans. Take the polls in 2016, for instance. Although they weren’t that bad from an accuracy standpoint, the majority underestimated President Trump and Republicans running for Congress and governor, leading them to underestimate how well Trump would do in the Electoral College. Overall in the 2015-16 cycle, polls had a Democratic bias (meaning they overestimated Democrats and underestimated Republicans) of 3.0 percentage points. And that came after a 2013-14 cycle when polls also had a Democratic bias (of 2.7 percentage points).
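The difference between average error and statistical bias is easiest to see side by side. The Python sketch below is illustrative only: the function names and poll figures are made up, margins are Democrat minus Republican, and it uses a simple unweighted mean rather than the weighted averages behind the published tables.

```python
def signed_error(poll_margin, actual_margin):
    """Positive means the poll overestimated the Democrat."""
    return poll_margin - actual_margin

def summarize(polls):
    """polls: list of (poll_margin, actual_margin). Returns (avg_error, bias_label)."""
    signed = [signed_error(p, a) for p, a in polls]
    avg_error = sum(abs(s) for s in signed) / len(signed)
    bias = sum(signed) / len(signed)  # misses in opposite directions cancel out
    if round(bias, 1) == 0:
        label = "even"
    else:
        label = f"{'D' if bias > 0 else 'R'}+{abs(bias):.1f}"
    return avg_error, label

# Every poll misses toward the Democrat by 3 points: large error, large bias.
print(summarize([(4, 1), (2, -1), (-3, -6)]))  # (3.0, 'D+3.0')
# Misses of the same size in opposite directions: same error, no bias.
print(summarize([(3, 0), (-3, 0)]))            # (3.0, 'even')
```

Both hypothetical batches have the same 3-point average error, but only the first one would systematically mislead a reader about which party is ahead.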

Polling bias is not very consistent from cycle to cycle

Weighted-average statistical bias of polls in final 21 days of the election, among polls in FiveThirtyEight’s Pollster Ratings database

Cycle	Governor	U.S. Senate	U.S. House	Pres. General	Combined
1998	R+5.7	R+4.8	R+1.5		R+4.2
1999-2000	D+0.6	R+2.9	D+0.9	R+2.6	R+1.8
2001-2002	D+3.0	D+1.4	D+1.3		D+2.2
2003-2004	R+4.2	D+1.7	D+2.5	D+1.1	D+0.9
2005-2006	D+0.3	R+1.3	D+0.2		R+0.1
2007-2008	D+0.5	D+0.8	D+1.0	D+1.1	D+1.0
2009-2010		R+0.7	D+1.7		D+0.6
2011-2012	R+1.3	R+3.3	R+2.6	R+2.5	R+2.6
2013-2014	D+2.3	D+2.5	D+3.7		D+2.7
2015-2016	D+3.3	D+2.8	D+3.7	D+3.1	D+3.0
2017-2019	R+0.9	D+0.1	R+0.3		R+0.3
All years	D+0.3	D+0.1	D+0.7	D+0.2	D+0.3

In 2017-19, however, polls had essentially no partisan bias, and to the extent there was one, it was a very slight bias toward Republicans (0.3 percentage points). And that’s been the long-term pattern: Whatever bias there is in one batch of election polls doesn’t tend to persist from one cycle to the next. The Republican bias in the polls in 2011-12, for instance, which tended to underestimate then-President Obama’s re-election margins, was followed by two cycles of Democratic bias in 2013-14 and 2015-16, as previously mentioned. There is simply not much point in trying to guess the direction of poll bias ahead of time; if anything, it often seems to go against what the conventional wisdom expects. Instead, you should always be prepared for the possibility of systematic polling errors of several percentage points in either direction.

Which pollsters have been most accurate in recent elections?

Although it can be dangerous to put too much stock in the performance of a pollster in a single election cycle — it takes dozens of polls to reliably assess a pollster’s accuracy — it’s nonetheless worth briefly remarking on the recent performance of some of the more prolific ones. Below, you’ll find the average error, statistical bias and a calculation we call Advanced Plus-Minus (basically, how the pollster’s average error compares to other pollsters’ in the same election),⁶ for pollsters with at least five polls in our database for the 2017-19 cycle. Note that negative Advanced Plus-Minus scores are good; they indicate that a firm’s polls were more accurate than others in the same races.

How prolific pollsters have fared in recent elections

Advanced Plus-Minus scores and other metrics for pollsters who conducted at least five surveys for the 2017-19 cycle, in FiveThirtyEight’s Pollster Ratings database

Pollster	Methodology	No. of Polls	Avg. Error	Bias	Adv. Plus-Minus
ABC News/Washington Post	Live	5	1.7	R+0.9	-4.1
Cygnal	IVR/Online/Live	9	2.5	D+1.9	-3.7
Mason-Dixon Polling & Research Inc.	Live	7	2.8	R+1.0	-3.0
Monmouth University	Live	9	3.1	R+1.7	-2.9
Suffolk University	Live	7	2.7	R+1.3	-2.7
Research Co.	Online	20	3.8	R+1.1	-2.3
Mitchell Research & Communications	IVR/Online	6	2.5	R+0.9	-2.0
Siena College/New York Times Upshot	Live	47	3.6	R+1.3	-1.7
Emerson College	IVR/Online	66	4.2	R+0.5	-1.5
Marist College	Live	13	4.4	D+2.7	-1.1
Landmark Communications	IVR/Online/Live	5	4.1	D+3.9	-1.0
YouGov	Online	12	3.1	R+1.7	-1.0
SurveyUSA	IVR/Online/Live	13	4.1	R+0.7	-1.0
Gravis Marketing	IVR/Online/Live	25	3.8	D+0.6	-0.8
Harris Insights & Analytics	Online	34	3.7	R+0.2	-0.2
Vox Populi Polling	IVR/Online	7	4.5	D+3.6	+0.0
St. Pete Polls	IVR	10	2.3	D+1.7	+0.0
Fox News/Anderson Robbins Research/Shaw & Co. Research	Live	10	4.7	D+2.7	+0.0
Remington Research Group	IVR/Live	5	4.1	D+3.1	+0.3
Change Research	Online	57	5.5	D+1.5	+0.6
Quinnipiac University	Live	13	4.3	D+2.7	+0.7
JMC Analytics/Bold Blue Campaigns	Live	5	6.7	R+5.5	+0.9
SSRS	Live	11	5.2	D+4.3	+0.9
Optimus	IVR/Online/Live/Text	5	6.8	R+6.8	+0.9
Strategic Research Associates	Live	5	5.0	D+1.9	+1.0
Susquehanna Polling & Research Inc.	IVR/Live	6	8.6	D+8.0	+1.4
Trafalgar Group	IVR/Online/Live	21	4.6	R+1.9	+1.6
Ipsos	Online	10	5.3	R+3.0	+2.2
Rasmussen Reports/Pulse Opinion Research	IVR/Online	5	6.1	R+5.8	+3.2
Carroll Strategies	IVR	5	9.9	R+9.9	+3.4
Dixie Strategies	IVR/Live	5	8.4	R+5.9	+3.8

Four of the top five and six of the top 10 pollsters according to this metric conducted their polls exclusively by live-caller telephone. In exciting news for fans of innovative polling, the list includes polls from our friends at The New York Times’s Upshot, who launched an extremely successful and accurate polling collaboration with Siena College in 2016. (It also includes ABC News, FiveThirtyEight’s corporate parent, which usually conducts its polls jointly with The Washington Post.)

Conversely, five of the six worst-performing pollsters, including firms such as Carroll Strategies, Dixie Strategies, and Rasmussen Reports/Pulse Opinion Research, were IVR pollsters (sometimes in conjunction with other methods), several of which had strong Republican leans in 2017-19. Some IVR pollsters did perform reasonably well in 2015-16, a cycle where most pollsters underestimated Republicans. In retrospect, though, that may have been a case of two wrongs making a right; IVR polls tend to be Republican-leaning, so they’ll look good in years where Republicans beat their polls, but they’ll often be among the worst polls otherwise.

Indeed, aggregating the pollsters by methodology confirms that live caller polls continue to be the most accurate. Below are the aggregate scores for the three major categories of polls — live caller, online, and IVR — by our Advanced Plus-Minus metric, average error and statistical bias.⁷

Live-caller polls have been most accurate in recent elections

Advanced Plus-Minus scores and other metrics for pollsters who conducted at least five surveys for the 2017-19 cycle, in FiveThirtyEight’s Pollster Ratings database

Methodology	No. of Polls	Avg. Error	Bias	Adv. Plus-Minus
Live caller w/cell	356	4.9	R+0.5	-0.3
Live caller w/cell only	210	4.4	R+0.2	-0.8
Live caller w/cell hybrid	146	5.5	R+0.9	+0.4
IVR	239	5.2	R+1.0	+0.3
IVR only	19	6.9	R+5.4	+2.4
IVR hybrid	220	5.0	R+0.4	+0.1
Online or text	358	5.0	R+0.4	+0.2
Online or text only	154	5.0	D+0.4	+0.5
Online or text hybrid	204	5.0	R+0.8	+0.1
All polls	628	5.0	R+0.3	+0.0

The differences are clearest when looking at pollsters that exclusively used one method. Polls that exclusively used live callers (including calling cellphones) had an average error of 4.4 percentage points in the 2017-19 cycle, as compared to 5.0 points for polls exclusively conducted online or via text message, and 6.9 points for polls that exclusively used IVR. (Pure IVR polls, however, are now quite rare. Polls that used a hybrid of IVR and other methods did better, with an average error of 5.0 percentage points.)

Polling firms that are members of professional polling organizations that push for transparency and other best practices also continue to outperform those that aren’t. In particular, our pollster ratings give credit to firms that support the American Association for Public Opinion Research (AAPOR) Transparency Initiative, belong to the National Council on Public Polls (NCPP), or contribute data to the Roper Center archive. Pollsters that are part of one or more of these initiatives had an average error of 4.3 percentage points in the 2017-19 cycle, as compared to 5.4 percentage points for those that aren’t.

Another way to detect herding

Our pollster ratings have also long included an adjustment to account for the fact that online and automated polls tend to perform better when there are high-quality polls in the field. We’ve confirmed that this still applies. For instance, polls that are conducted online or via IVR⁸ are about 0.4 percentage points more accurate based on our Advanced Plus-Minus metric when their polls are preceded by “gold standard” polls in the same race. (“Gold standard” is the term we use for pollsters that are exclusively live caller with cellphones and are also AAPOR/NCPP/Roper members.) Live-caller polls do not exhibit the same pattern, however; their Advanced Plus-Minus score is unaffected by the existence of an earlier “gold standard” poll in the field. This is probably the result of herding; some of the lower-quality pollsters may be doing the equivalent of peeking at their more studious classmate’s answers in a math test. In fact, these differences are especially strong in recent elections, suggesting that herding has become more of a problem.

There is also a second, more direct method to detect herding, which we’re also now applying in our pollster ratings. Namely, as described in this story, there is a minimum distance that a poll should be from the average of previous polls based on sampling error alone. For instance, even if you knew that a candidate was ahead 48-41 in a particular race (a 7-point lead), you’d miss that margin by an average of about 5 percentage points in a 600-person poll, because sampling only 600 people rather than the entire population introduces sampling error. That is, because of sampling error, some polls would inevitably show a 12-point lead and some would show a 2-point lead, instead of all the polls being bunched together at exactly a 6-, 7- or 8-point lead. If the polls are very tightly bunched together, this is not a good thing: you should be suspicious of herding, which can sometimes yield embarrassing outcomes where every poll gets the answer wrong.

Of course, there are other complications in the real world. There’s no guarantee that the race will have been static since other pollsters surveyed the race; one candidate may be losing or gaining ground. And pollsters have healthy methodological disagreements with one another, so the same race may look different depending on what assumptions they make about turnout and so forth. But these should tend to increase the degree to which polls differ from each other, and not produce herding.

But our herding penalty only applies if pollsters show too little variation from the average of previous polls of the race⁹ based on sampling error¹⁰ alone. If a pollster is publishing all its data without being influenced by other pollsters — including its supposed outliers — it should be fairly easy to avoid this penalty over the long run.

Many polls are closer to the average of previous polls than they “should” be, however. Unlike the previous type of herding I described, which is concentrated among lower-quality pollsters who are essentially trying to draft off their neighbors to get better results, this tendency appears among some higher-quality pollsters as well. In some cases, we suspect, this is because, late in the race, a pollster doesn’t want to deal with the media firestorm that would inevitably ensue if it published a poll that appears to be an outlier. In other cases, frankly, we suspect that pollsters rather explicitly look at the FiveThirtyEight or RealClearPolitics polling average and attempt to match it.

In any event, our formula now detects this type of herding, and it results in a lower pollster rating when we catch it.¹¹ Our pollster ratings spreadsheet now calculates each pollster’s Average Distance from Polling Average, or ADPA, which is how much the pollster’s average poll differs from the average of previous polls of that race.¹² Among pollsters with at least 15 polls,¹³ the largest herding penalties are as follows:

Which pollsters show the clearest signs of herding?

Pollster	Herding Penalty
Research Co.	1.17
Muhlenberg College	0.84
Angus Reid Global	0.82
Grove Insight	0.71
NBC News/Wall Street Journal	0.53
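As a rough illustration of how a penalty like the ones in the table could be computed, here is a minimal Python sketch based on the description in the text and footnotes. The function names and sample figures are hypothetical, and the theoretical minimum ADPA is taken as an input because the article does not spell out its formula.

```python
from math import sqrt

def adpa(polls):
    """Average Distance from Polling Average for one pollster.

    polls: iterable of (poll_margin, prior_average_margin, n_other_polls).
    Each poll is weighted by the square root of the number of other polls
    in the field, per the footnote.
    """
    num = sum(sqrt(n) * abs(margin - avg) for margin, avg, n in polls)
    den = sum(sqrt(n) for _, _, n in polls)
    return num / den

def herding_penalty(actual_adpa, minimum_adpa):
    """Half the shortfall below the theoretical minimum ADPA; zero otherwise."""
    return max(0.0, 0.5 * (minimum_adpa - actual_adpa))

# Hypothetical pollster whose polls sit 2 and 1 points from the prior averages.
sample = [(5.0, 7.0, 4), (8.0, 7.0, 1)]
print(round(adpa(sample), 2))     # 1.67
print(herding_penalty(3.0, 5.0))  # 1.0
print(herding_penalty(6.0, 5.0))  # 0.0
```

A pollster whose results vary by at least as much as sampling error implies pays nothing, which matches the point that publishing outliers is enough to avoid the penalty over the long run.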

Other methodological changes

Unless you’re really into details — or you’re a pollster! — you probably aren’t going to care about these … but there are a few other methodological changes we’ve made to our pollster ratings this year.

Previously, pollsters got a bonus if they exclusively conducted their polls via live callers with cellphones, since these have been the most accurate polls over time. But this year, if a pollster uses live-caller-with-cellphone polls in combination with other methodologies, we now give them partial credit for the live-caller bonus. Even though these hybrid polls did not have a particularly good performance in 2017-19, they’ve been reasonably strong in the long run; also, we’re bowing to the reality that many formerly live pollsters are increasingly incorporating online or other methods into their repertoire.
In determining whether a poll’s result fell into or outside the margin of error, a calculation that’s available in our spreadsheet, we now use a more sophisticated margin of error formula that accounts for the percentages of the top two candidates and not just the distance between them. The margin of error is smaller in lopsided races, e.g., when one candidate leads 70-20.
Our Predictive Plus-Minus scores and pollster letter grades are based on a combination of a pollster’s empirical performance (how accurate it has been in the past) and its methodological characteristics. The more polls a firm has conducted, the more the formula weights its performance rather than its methodological prior. In assigning the weights, our formula now considers how recent a particular firm’s polls were. In other words, if a pollster has conducted a lot of surveys recently, its empirical accuracy will be more heavily weighted. But if most of its polling is in the distant past, its pollster rating will gradually revert toward the mean based on its methodology.
For pollsters with a relatively small sample of polling, we now show a provisional rating rather than a precise letter grade. (An “A/B” provisional rating means that the pollster has shown strong initial results, a “B/C” rating means it has average initial results, and a “C/D” rating means below-average initial results.) It now takes roughly 20 recent polls (or a larger number of older polls) for a pollster to get a precise pollster rating.
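The margin-of-error change described above can be illustrated with a textbook formula for the sampling error of a lead between two candidates in the same poll. This is a sketch under assumptions: the article does not give its exact formula, so the code uses the standard multinomial result for simple random sampling, and the function name `margin_moe` and the 600-person sample size are chosen for illustration.

```python
from math import sqrt

def margin_moe(p1, p2, n, z=1.96):
    """Approximate 95% margin of error, in points, on the lead p1 - p2.

    Assumes simple random sampling; p1 and p2 are the top two candidates'
    shares as proportions (0.48 means 48 percent).
    """
    variance = (p1 + p2 - (p1 - p2) ** 2) / n
    return 100 * z * sqrt(variance)

close = margin_moe(0.48, 0.41, 600)     # a 48-41 race
lopsided = margin_moe(0.70, 0.20, 600)  # a 70-20 race
print(round(close, 1), round(lopsided, 1))  # 7.5 6.5
```

Because the variance depends on both shares and not only on the gap between them, the same sample size yields a tighter interval in the lopsided race.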

That’s all for now! Once again, you can find an interactive version of the pollster ratings here, and a link with further detail on them here. And if you have questions about the pollster ratings, you can always reach us here. Good luck to pollsters on having a strong performance in the primaries.

Footnotes

¹ We exclude polls of last November’s U.S. House race in North Carolina’s 9th Congressional District, where the results were not certified due to allegations of election fraud.
² These totals exclude polls that are banned by FiveThirtyEight because we suspect them to be fake.
³ Specifically, the weights are based on the square root of the number of polls that a firm conducted. For instance, a pollster that conducted 9 polls would be weighted 3 times as much as a pollster that conducted a single poll.
⁴ Pollsters get half-credit if they show a tie for the lead and one of the leading candidates wins.
⁵ In the 2017-19 cycle, the average poll was conducted in a race where the eventual margin of victory was 8.9 percentage points, as compared with 11.7 percentage points in the 2005-06 cycle, for example.
⁶ Or the same type of election in elections where there weren’t very many pollsters.
⁷ The weights are based on the square root of the number of polls that a firm conducted. Polls banned by FiveThirtyEight are excluded.
⁸ Or via online and IVR combined.
⁹ Specifically, we look at the average of previous polls of the race in the polling database where the median field date was at least 3 days earlier. The average includes only the most recent poll from each pollster in each race and excludes partisan polls and pollsters FiveThirtyEight has banned.
¹⁰ Both the sampling error of the poll and the sampling error of the polling average it’s being compared to are used in the calculation.
¹¹ Specifically, one-half of the difference between a pollster’s actual ADPA (described below) and its theoretical minimum ADPA is applied as a penalty to a pollster’s Advanced Plus-Minus before calculating its Predictive Plus-Minus.
¹² Weighted based on the square root of the number of other polls in the field.
¹³ Specifically, 15 polls for which an average of previous polls is available.
