At FiveThirtyEight, we strive to accumulate and analyze polling data in a way that is honest, informed, comprehensive and accurate. While we do occasionally commission polls, most of our understanding of American public opinion comes from aggregating polling data conducted by other firms and organizations. This data forms the foundation of our polling averages, election forecasts and much of our political coverage.
In building our polling database, we aim to be as inclusive as possible. This means we will collect any poll that has been made publicly available and that meets a few basic standards:
- The poll must include the name of the pollster, survey dates, sample sizes and details about the population sampled. If these are not included in the poll’s release, we must be able to obtain them in order to include the poll.
- Pollsters must also be able to answer basic questions about their methodology, including but not limited to the medium through which their polling was conducted (e.g., landline calls, text, etc.), the source of their voter files and their weighting criteria.
However, there are some types of polls we don’t include, such as:
- “Nonscientific” polls that don’t attempt to survey a representative sample of the population or electorate.
- Polls produced using MRP (short for “multilevel regression with poststratification”), such as the Civiqs and Microsoft News tracking polls. MRP is a modeling technique that uses survey interviews to estimate probabilities, then projects those probabilities onto the composition of the population in question. While this is a valid technique for understanding public opinion, we exclude these polls because we consider them more like models than individual polls. (As an analogy, we think of this as using someone else’s barbecue sauce as an ingredient in your own barbecue sauce.)
An additional list of edge cases in which we may exclude polls can be found in this article. For instance, we exclude polls that ask voters who they support only after revealing leading information about the candidates.
We do include internal polls that are publicly available, except in one unusual circumstance (a general election poll sponsored by a candidate’s rival in the primary). Internal and partisan polls have an asterisk next to the pollster’s name. The asterisk does not indicate whether the pollster itself is partisan, but whether the money that paid for the poll came from a partisan source. Polls are considered partisan if they’re conducted on behalf of a political party, campaign committee, PAC, super PAC, 501(c)(4), 501(c)(5) or 501(c)(6) that conducts a large majority of its political activity on behalf of one political party. Additionally, in cases where a pollster routinely conducts polls for partisan sponsors, and we learn that they’re not being transparent about sponsorship, we will count all of their polls as partisan.
Additionally, if we find that a sponsor organization is selectively releasing polls favorable to a certain candidate or party, we may also categorize that organization as partisan. We generally go out of our way to not characterize news organizations as partisan, even if they have a liberal or conservative view. But selectively releasing data that favors one party is a partisan action, and such polls will be treated as such. These classifications may be revisited if a sponsor ceases engaging in this behavior.
Polls we suspect are fake will also not be included until we conduct a thorough investigation and can confirm their veracity. We will permanently ban any pollster found to be falsifying data. Additionally, we reserve the right to ban polls sponsored by a particular organization that consistently engages in dishonest or nontransparent behavior that goes beyond editorializing and political spin.
Our guidelines for inclusion are intentionally permissive. We aim to capture all publicly available polls that are being conducted in good faith. While some polls have a more established track record of accuracy than others — and we do take that into account in our models and political analysis — we exclude polls from our dataset only in exceptional circumstances.
Below are some questions we’ve been asked often over the years about the types of polls we collect. Take a look and if you still have questions or find a poll we don’t have, please email us at email@example.com.
Frequently Asked Questions
Q: Which races do you collect polls for?
A: We collect polls for presidential, Senate, House and gubernatorial races in addition to presidential approval polls and congressional generic ballot polls at the national level. At this time, we do not collect primary polls other than for the presidency — except in cases of a “jungle primary,” as it’s possible for a candidate to win the seat outright. The latest polls page includes all polls publicly released within two years of an election. If we don’t have any polls for a race in a specific state, that means we weren’t aware of any polls there.
Q: Can I download this data?
A: Yes! There are links at the bottom of the latest polls page, but you can also download this data and more from our data repository, which houses all of our polls, forecasts and other data projects. Unfortunately, however, we are not able to share historical data for presidents’ approval ratings. You can find additional information on historical presidential approval ratings, and guidelines for acquiring that dataset, on the Roper Center’s website.
Q: How do you account for a pollster that publishes multiple results for a question?
A: When a pollster publishes multiple subsamples (for example, all adults, only registered voters, and only likely voters), we include all of them in our database. Similarly, if a pollster asks a horse-race question with different sets of candidates — for example, with and without a third-party candidate — we include all versions of this question in our database. If the pollster includes multiple likely voter models, as in this Monmouth poll, we first check if the pollster indicates that one of the versions is its preferred option. If it does, we use that version; otherwise, we include them all in our database.
Q: How do you account for a pollster that publishes numbers with and without “leaners”?
A: When a pollster publishes one version of a question with “leaners” — respondents who may be uncertain about their vote but say that they lean toward a particular candidate or party — and one without, we include only the version with leaners. If the question including leaners is a “forced-choice” question, in which respondents are not given an option of saying they are undecided when asked which way they lean, we still include only that version of the question instead of the version without leaners.
Q: Why do you sometimes add old polls to the latest polls page?
A: Polls are added to our latest polls page as soon as they are added to our database. If older polls show up on the polls page, it is because they have recently been added to our database — either because we were only just made aware of them or because they had previously been unreleased.
Q: What do the pollster grades mean?
A: Pollster grades summarize a pollster’s historical accuracy: we evaluate how its polls conducted within three weeks of an election compared with the actual results, and assign a letter grade accordingly.
Q: What does it mean when a pollster has a rating like “A/B” or a dotted circle around the rating?
A: For pollsters with a relatively small sample of polling within three weeks of an election, we show a provisional rating (“A/B,” “B/C,” or “C/D”) rather than a precise letter grade and use a dotted circle to emphasize that the rating is provisional.
Q: Why do some pollsters not have a grade?
A: For some pollsters, we do not have any polls from the last three weeks of an election cycle, which means we can’t evaluate their historical performance. We do still include polls from these pollsters in our polling averages and models, but because we cannot rate their historical performance, their polls receive less weight.
Q: Do you weight or adjust polls?
A: Yes. When we calculate our polling averages, some polls get more weight than others. For example, polls that survey more people or have a historical track record of accuracy get more consideration in calculating the average than polls with small sample sizes or polls that have been historically less accurate. Additionally, our polling averages apply adjustments for things like a pollster’s house effect (a measure of how consistently a pollster leans toward one party or candidate) or trends in polls from similar states. For more information on how we weight and adjust polls for polling averages, read this detailed explanation of our 2020 methodology.
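To make the idea concrete, here is a toy sketch of a weighted, adjusted polling average. This is not FiveThirtyEight’s actual model; the poll values, quality weights, house effects and the sqrt-of-sample-size weighting below are all hypothetical choices for illustration only.

```python
import math

# Hypothetical polls: (result_pct, sample_size, quality_weight, house_effect)
# quality_weight stands in for a pollster's historical accuracy;
# house_effect is how far the pollster consistently leans toward this candidate.
polls = [
    (52.0, 1200, 1.0, 1.5),
    (49.0, 600, 0.8, -0.5),
    (50.0, 900, 0.6, 0.0),
]

def weighted_average(polls):
    numerator = 0.0
    denominator = 0.0
    for result, n, quality, house in polls:
        # Bigger samples and more accurate pollsters get more weight;
        # scaling with sqrt(n) is a common convention, used here only as an example.
        weight = quality * math.sqrt(n)
        # Subtract the house effect so a pollster's consistent lean
        # doesn't pull the average toward one side.
        numerator += weight * (result - house)
        denominator += weight
    return numerator / denominator

print(round(weighted_average(polls), 1))  # → 50.1
```

Note how the adjusted average lands between the raw results: the high-quality, large-sample poll pulls hardest, but its house effect is discounted first.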
Q: When do you show third-party candidates on the latest polls page or in your polling averages?
A: We include third-party candidates in polls that ask about them. To find those polls, enter the candidate’s last name in the search box on our latest polls page. As for our averages, if enough pollsters ask questions that include a specific third-party candidate, we will include that candidate.
Q: Why are the sample sizes sometimes missing for polls?
A: If a poll does not have a sample size listed, the pollster or sponsor did not report it and we are actively working to obtain it. These polls are still included in our averages and models with an imputed sample size until we obtain the actual sample size.
Q: Why do the values in some polls add up to more than 100 percent?
A: Values in some polls may add up to more than 100 percent due to rounding. For example, if a pollster published a poll with an approval rating of 46.5 percent and a disapproval rating of 53.5 percent, the two values would round up to 47 percent and 54 percent, respectively, for a displayed total of 101 percent.
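A quick numeric check of that arithmetic (the half-up rounding helper is our own, since Python’s built-in round() rounds .5 values to the nearest even number rather than up):

```python
import math

def round_half_up(x):
    # Python's round() uses banker's rounding (round(46.5) == 46),
    # so round half up explicitly, matching how poll toplines are displayed.
    return math.floor(x + 0.5)

approve, disapprove = 46.5, 53.5
assert approve + disapprove == 100.0  # the raw values sum exactly to 100
displayed_total = round_half_up(approve) + round_half_up(disapprove)
print(displayed_total)  # → 101: both .5 values round up
```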
Q: Why do the margins in some polls not match what the pollster reports?
A: This also boils down to rounding. For example, if a pollster puts one candidate at 45.2 percent and another at 44.8 percent, we’ll display both candidates at 45 percent. That means we’ll display their margin as 0 points even though the actual margin is 0.4 points.
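The same rounding effect can compress a margin to zero. In this sketch (the numbers are illustrative), two candidates 0.4 points apart display identically:

```python
import math

def round_half_up(x):
    # Explicit half-up rounding; Python's built-in round() uses banker's rounding.
    return math.floor(x + 0.5)

candidate_a, candidate_b = 47.3, 46.9
actual_margin = candidate_a - candidate_b  # about 0.4 points
displayed_margin = round_half_up(candidate_a) - round_half_up(candidate_b)
print(displayed_margin)  # → 0: both candidates display as 47 percent
```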
Still have questions? Send us an email and we’ll do our best to sort it out.