Taegan Goddard has an article up entitled “Where’s the Transparency in Pollster Ratings?”. It’s a short item so I’m going to post it in full:
Nate Silver’s pollster scorecard is an interesting experiment in trying to hold the political polling industry to a higher standard. It’s long overdue and could prove very useful to consumers of this information.
In explaining his methodology, Silver found that “the scores of polling firms which have made a public commitment to disclosure and transparency hold up better over time. If they were strong before, they were more likely to remain strong; if they were weak before, they were more likely to improve.”
But when I talk with pollsters about the latest scorecard, they’re universally puzzled as to why Silver doesn’t hold himself to the same level of transparency and release his database of polls. In fact, some even claim he’s using faulty data in putting together his rankings.
While Silver’s efforts are admirable — and even caused one of the more controversial firms to vanish from the scene — it’s a point worth considering before giving his pollster rankings too much weight.
Where’s the transparency? Well, it’s here, in an article that contains 4,807 words and 18 footnotes. Literally every detail of how the pollster ratings are calculated is explained. It’s also here, in the form of our Pollster Scorecards, a feature which we’ll continue to roll out over the coming weeks for each of the major polling firms, and which will explain in some detail how we arrived at the particular rating for each one.
Taegan does ask a good question about why the complete polling database has not been released publicly. The principal reason is that I don’t know that I’m legally entitled to do so. The polling database was compiled from approximately eight or ten distinct data sources, which were disclosed in a comment I posted shortly after the pollster ratings were released, and which are detailed again at the end of this article.* These include some subscription services, and others from websites that are direct competitors of this one. Although polls contained in these databases are ultimately a matter of public record, and clearly we feel we have every right to use them for research purposes, I don’t know what rights we might have to re-publish their data in full. Nor do I know whether doing so would be fair or wise; it is certainly not my intention to undermine PollingReport.com’s business model, for instance. But essentially, the database is something which, albeit with considerable time and effort, and a small out-of-pocket expenditure, anybody could re-create for themselves.
I understand that people don’t like to have the quality of their work judged. I’m sure that the first newspaper to print a summary of batting averages took a lot of heat for it too.
It is pretty ironic, however, that the vehicle for criticisms about openness and transparency was an anonymously-sourced item in Taegan’s newsletter.
Taegan, who were your sources?
Pollsters, which of you talked to Taegan? Which of you left Taegan with the impression that I’m using “faulty data in putting together [my] rankings”, and what did you mean by that?
Whether coincidentally or not, the tone of the criticisms in Taegan’s note mirrors those which Del Ali of Research 2000 has made to me in a series of e-mails and phone calls. Research 2000 was dismissed today as the polling firm for Daily Kos.
Markos Moulitsas is a friend — readers with long memories will recall that I got my start in political writing as a diarist on the Daily Kos website, originally under the pseudonym “poblano” — and I consulted him about his decision.
There were two “bursts” of communications between Markos and me, one obviously having come in the last week or so, surrounding the publication of the pollster ratings. The other came in early February of this year, and followed the publication of Research 2000’s headline-grabbing poll of registered Republicans. The nature of the poll was unusual, but I and others found that it contained results inconsistent with those released by other pollsters on similar questions, which led me to start treating Research 2000 polling with more scrutiny.
I also pointed out to Markos that Research 2000 polling has a significant Democratic-leaning house effect, something that is problematic for any pollster, and that one should arguably be especially sensitive to when running a Democratic website. I have certainly posed plenty of questions to Rasmussen, whose polls have similarly had a Republican-leaning house effect.
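For readers unfamiliar with the term: a house effect is a pollster’s systematic lean, i.e., the average amount by which its margins differ from the consensus of other polls of the same races. The sketch below is purely illustrative of that concept, using made-up firm names and numbers; it is not FiveThirtyEight’s actual rating methodology, which is described in the methodology article linked above.

```python
# Illustrative sketch of a "house effect" calculation. This is NOT
# FiveThirtyEight's actual methodology; firm names and margins are
# invented for the example.

from collections import defaultdict

def house_effect(polls, firm):
    """polls: list of (race, firm, dem_minus_rep_margin) tuples.
    Returns the firm's average deviation from the other firms'
    average margin in the same races (positive = Democratic lean)."""
    by_race = defaultdict(list)
    for race, f, margin in polls:
        by_race[race].append((f, margin))

    deviations = []
    for entries in by_race.values():
        own = [m for f, m in entries if f == firm]
        others = [m for f, m in entries if f != firm]
        if own and others:
            consensus = sum(others) / len(others)
            deviations.extend(m - consensus for m in own)
    return sum(deviations) / len(deviations) if deviations else 0.0

polls = [
    ("KY-Sen", "Firm A", +4.0), ("KY-Sen", "Firm B", -2.0),
    ("KY-Sen", "Firm C", -3.0),
    ("PA-Sen", "Firm A", +7.0), ("PA-Sen", "Firm B", +2.0),
    ("PA-Sen", "Firm C", +3.0),
]
print(house_effect(polls, "Firm A"))  # → 5.5
```

In this toy data, “Firm A” runs about 5.5 points more Democratic than the consensus of the other firms polling the same races. A lean like this is not by itself evidence of bad polling, but it is something a client should know about.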
I discussed with Markos that Research 2000 had made some poor choices in areas like question wording for some of their other clients.
I pointed out to Markos that Research 2000’s polls have been, from a quantitative perspective, on a downward trajectory, and shared with him an advance copy of Research 2000’s Pollster Scorecard, which is reproduced below.
The Scorecard finds that Research 2000’s polls were slightly above-average in 2000 and 2002, slightly worse than average in 2004 and 2006, but then distinctly worse than average in 2008 and thus far in 2010, with the exception of the Presidential general election, where they did fine. The Scorecard does not include any results from last night’s primary contests, in which Research 2000 generally did poorly.
Finally, I pointed out to Markos that, while it was admirable that Research 2000 has made a habit of publishing their demographic cross-tabs, the cross-tabs sometimes did not inspire more confidence in their results. In February, when I first became more skeptical about Research 2000, they were consistently finding, for instance, that Barack Obama’s favorability rating was in the mid-60s among voters aged 45-59, but in the low-40s among voters aged 30-44. And they were repeating this finding week after week after week, even though it was wildly inconsistent with what other pollsters like Gallup had found. In March, when Research 2000 changed the methodology on its tracking poll from a sample of adults to one of registered voters, many of the cross-tabs changed dramatically and are now more consistent with those produced by other pollsters.
I should be clear that most of these problems are recent. Research 2000 has done polling on behalf of many newspapers for as long as a decade. The quality of their work had generally been fine, although our May 2008 version of the pollster ratings — which rated them as being about average or even slightly better than average — had not looked at their polling of House races, which has always been spotty, nor could it have anticipated the deterioration in quality that they have undergone since then. It’s not especially easy to go from publishing 30 polls per year, in eight or ten states, to a couple hundred per year in every state across the country.
And last but not least, here is a chart containing every poll that I have in my database for Research 2000.
As stipulated above, I do not know that I’m going to be able to release my data in full for other pollsters, but I’ve certainly taken the suggestion under advisement.
Principal Sources for Polling Database
— Pollster.com (2006-)
— RealClearPolitics.com (2000-)
— PollingReport.com (1998-)
— FiveThirtyEight.com electoral forecasting database (2008-)
— CNN/AllPolitics (1998)
— SurveyUSA Interactive Electoral Scorecard, unlocked version, provided to FiveThirtyEight by Jay Leve (1998-2004)
— Google News, including some paid Archive searches, especially for 2000 and 2004 primaries.
— Harris Interactive press release containing 2000 results, provided to FiveThirtyEight by Harris Interactive (2000-)
— Various live versions of pollster websites (1998-)
— Various archived versions of pollster websites, via Internet Archives (1998-)