Over the coming days and weeks, I’m going to be posting ‘scorecards’ for most of the major polling firms. These are part and parcel of our new pollster ratings and will give you some idea of how we arrive at the numbers that we do. Some polling firms excel at general elections, but struggle in primaries; for others, the opposite is true. Some polling firms have a steep difficulty curve, and might be strong in Presidential elections, but see their polling deteriorate the further downballot one gets; others thrive when polling in more challenging contexts.
Let me show you the scorecard for SurveyUSA, which has an exemplary track record, and then we’ll discuss what everything means. I should warn you that, although we’ve tried to design these charts as ergonomically as possible, it’ll be a bit of data overload for the uninitiated.
The pollster scorecard consists of six mini-charts, each one providing a series of statistics broken down by the type of election and the electoral cycle in which the survey was conducted. There are six types of races that we track: general elections for (i) President, (ii) the U.S. Senate, (iii) the U.S. House, and (iv) governor, plus (v) Presidential primaries and (vi) Senate/gubernatorial primaries, which are grouped together and which we’ve begun tracking only as of this year. Our database begins in 1998. Note that odd-year elections are grouped with the following even-numbered year; for instance, last year’s gubernatorial elections in Virginia and New Jersey are classified as being in the 2010 cycle.
The first two charts are straightforward: they reflect a simple count of the number of polls in our database for the firm (those that were conducted within 21 days of the election), and the percentage of the weighted sample that they constitute. Note that more recent election cycles are weighted more heavily.
The next chart is “raw error”, which is simply how much the pollster erred on forecasting the margin between the two leading candidates, on average.
There are two ways to screw up your polling, however. You can be imprecise, or you can be biased. The chart to the right of the raw error table gets at the latter problem. It shows the direction in which the error occurred, on average. For instance, a bias score of “D 1.2” would indicate that the poll missed to the Democratic side by an average of 1.2 points. We should distinguish this from a house effect, which measures the partisan direction in which a poll leans relative to other polls; the bias score instead measures how the polls fared as compared to the actual outcome of the elections. Let’s say that Rasmussen is right that the Democrats are going to get utterly slaughtered in November, and they continue to be much more aggressive about predicting this than other polling firms. If they turn out to be on the money, then their polling, even though it shows a strong house effect, will not be ‘biased’ as we define it here; on the contrary, the results from the other polling firms will be biased! Although house effects are probably predictive of bias, there are also cases in which the entire polling industry has missed the mark. For instance, the average of all polls in the 1998 cycle displayed a 4.6-point Republican bias, while polling in 2002 had a 2.3-point Democratic bias.
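To make the difference between raw error and bias concrete, here is a minimal sketch in Python. It assumes that raw error is the average absolute miss on the margin and that bias is the average signed miss; the poll figures are invented for illustration, and the averages are unweighted, whereas the scorecards weight recent cycles more heavily.

```python
from statistics import mean

# Hypothetical polls: (poll margin, actual margin), each expressed as
# Democratic share minus Republican share, in points. Invented data.
polls = [(5.0, 3.0), (-2.0, 1.0), (4.0, 4.5)]

def signed_errors(polls):
    """Poll margin minus actual margin; positive means a miss toward the Democrat."""
    return [poll - actual for poll, actual in polls]

def raw_error(polls):
    """Average size of the miss, ignoring direction (assumed definition)."""
    return mean(abs(e) for e in signed_errors(polls))

def bias(polls):
    """Average direction of the miss; misses in opposite directions cancel out."""
    return mean(signed_errors(polls))

def bias_label(b):
    """Format a signed bias the way the scorecard does, e.g. 'D 1.2' or 'R 0.5'."""
    side = "D" if b >= 0 else "R"
    return f"{side} {abs(b):.1f}"

print(round(raw_error(polls), 2), bias_label(bias(polls)))
```

A pollster can have a large raw error and almost no bias if its misses land on both sides of the actual result, which is why the two charts are reported separately.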
The chart in the bottom left-hand corner, “relative error”, is probably the most important one. It shows how a polling firm fared as compared to other firms which surveyed the same elections, or the same types of elections, as determined by the regression analysis embedded in our pollster ratings. Negative numbers (green) are good news: they indicate that the polling firm had less error than its counterparts in the races that it surveyed; positive numbers (red) are bad news. Like the pollster ratings, the relative error chart is intended to reflect true pollster skill, and to strip out errors attributable to sample variance or temporal variance.
Finally, we have a chart called ‘contribution to score’, which technically speaking is the relative error chart in the bottom LH corner multiplied by the ‘percent of weighted sample’ chart in the top RH corner. It tells you how we arrive at a polling firm’s rawscore (i.e., our estimate of its error/skill before any reversion to the mean). For instance, SurveyUSA’s strong performance in the 2008 Presidential primaries contributed -0.22 points toward its strong rawscore of -0.84.
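The arithmetic behind ‘contribution to score’ can be sketched in a few lines of Python. The cell values below are invented and are not SurveyUSA’s actual figures; the sketch also assumes that the weighted-sample shares sum to one and that the rawscore is simply the sum of the individual contributions.

```python
# Hypothetical scorecard cells keyed by (race type, cycle).
# Each value is (relative error in points, share of weighted sample).
cells = {
    ("President", 2008): (-1.0, 0.5),
    ("House", 2006): (-0.5, 0.3),
    ("Senate", 2002): (0.5, 0.2),
}

def contributions(cells):
    """Relative error multiplied by weighted-sample share, cell by cell."""
    return {key: rel_error * share for key, (rel_error, share) in cells.items()}

def rawscore(cells):
    """Assumed here to be the sum of all contributions, before any reversion to the mean."""
    return sum(contributions(cells).values())

for key, value in contributions(cells).items():
    print(key, round(value, 2))
print("rawscore:", round(rawscore(cells), 2))
```

Under these assumptions, a cell matters a lot only if the firm both did unusually well (or badly) there and conducted a large share of its weighted polling there.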
Now, to be a bit less abstract, let’s note several features of SurveyUSA’s performance:
— SurveyUSA is very consistent. In each of the six types of elections that we consider, they’ve done better than their counterparts. However, their skill is most concentrated in the more difficult types of races. For instance, they’ve been 1.2 points better than their counterparts when polling House races, and about the same margin when polling primaries. This is oftentimes the emblem of a truly skilled pollster.
— SurveyUSA has also held up well over different political cycles. They kicked butt in 1998, an environment in which many other pollsters struggled. And they’ve been very good from 2004 onward. That leaves 2000 and 2002, in which their performance was only average. But no polling firm is going to nail every cycle; if it can manage to be average, rather than below average, in its off years, that is usually a good sign.
— Over the long run, SurveyUSA has had essentially zero partisan bias. It did have a clear Republican lean in 2000, and a clear Democratic lean in 2002 — not coincidentally, these were its relatively weaker cycles. But generally, its polls have been straight down the middle, and especially so in recent election cycles.
As I hope is obvious, SurveyUSA is a very strong polling firm; no company has done more to contradict the notion that a “robopollster” need be inferior. Although it’s not my place to make any endorsements, it would certainly make the life of electoral forecasters easier if SurveyUSA were to get more business.