This post appears in Wednesday’s paper. It is a revised version of a post that was published online Monday. The full version is here.
Even a fairly calm spell in the polling, like the last couple of days, can give people opportunities to see what they want in the data.
The most egregious form of this is cherry-picking the three or four polling results that you like best for your candidate. A vast majority of the time, you can find a couple that are favorable for your side.
If you looked at only the three best national polls for President Obama on Monday, you would conclude that he was three points ahead in the national race. If you looked at only Mitt Romney’s three best polls, you would say that he was ahead by two points instead.
Most people avoid this sort of mistake. It is just too flagrant a case of cherry-picking when 20 polls are published in a day and somebody discusses only two or three of them.
There is a more subtle form of bias, however, that a lot more of us are prone to. That bias is to look at all the data — except for the two or three data points that you like least, which you dismiss as being “outliers.”
If you are a Democrat, for example, and throw out Mr. Romney’s three most favorable polls from the 10 national surveys published on Monday, you can claim that Mr. Obama is ahead in the race by 1.3 percentage points. If you are a Republican and do the same thing, dropping Mr. Obama’s three best polls, you will have Mr. Romney ahead by one point instead.
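The arithmetic behind both kinds of selective reading can be sketched in a few lines of Python. The ten margins below are hypothetical, chosen only to resemble the pattern described above; they are not Monday's actual polls, and the function names are illustrative.

```python
from statistics import mean

# Hypothetical Obama-minus-Romney margins, in percentage points, for ten
# national polls. Illustrative values only, not the actual surveys.
margins = [4, 3, 2, 1, 0, 0, -1, -1, -2, -3]


def best_k_average(values, k, favor_obama=True):
    """Average only the k polls most favorable to one candidate (cherry-picking)."""
    ranked = sorted(values, reverse=favor_obama)
    return mean(ranked[:k])


def drop_worst_k_average(values, k, favor_obama=True):
    """Average every poll except the k least favorable to one candidate."""
    ranked = sorted(values, reverse=favor_obama)
    return mean(ranked[:len(ranked) - k])


all_polls = mean(margins)
obama_cherry = best_k_average(margins, 3, favor_obama=True)
romney_cherry = best_k_average(margins, 3, favor_obama=False)
democrat_view = drop_worst_k_average(margins, 3, favor_obama=True)
republican_view = drop_worst_k_average(margins, 3, favor_obama=False)

# Positive numbers favor Obama, negative numbers favor Romney.
print(f"All ten polls:               {all_polls:+.1f}")
print(f"Obama's three best only:     {obama_cherry:+.1f}")
print(f"Romney's three best only:    {romney_cherry:+.1f}")
print(f"Drop Romney's three best:    {democrat_view:+.1f}")
print(f"Drop Obama's three best:     {republican_view:+.1f}")
```

With these made-up numbers, the average of all ten polls is close to a tie, yet discarding three "outliers" on one side or the other moves the result by about a point in either direction, roughly half the swing produced by outright cherry-picking.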
That is not quite as biased as cherry-picking the best results — but it gets you halfway there, and it is much easier to rationalize. There is something that can be criticized about almost every poll: the methodology, the demographics, the sample size, the pollster’s history or something else.
Often, these critiques have some truth in them. Not all polls are as methodologically sound as others. But frequently people come up with reasons, valid or otherwise, to avoid looking at the polls they don’t like — while giving a pass to those they do.
Likewise, people sometimes make too much of demographic or geographic subsamples within a poll that make their preferred candidate look good. The most recent Washington Post/ABC poll had Mr. Obama performing better in what it termed swing states than in the country as a whole; a recent Gallup poll showed just the opposite.
These subsamples of swing-state voters from national polls are largely useless. A typical national poll may interview 1,000 people, of whom perhaps 250 or 300 will live in swing states.
The margin of error on a 250- or 300-person subsample is enormous: about plus or minus six percentage points. (The swing-state sample from the Gallup poll was somewhat larger, but still small compared with the 3,000 or so voters that it interviews for each instance of its national tracking poll.)
In contrast, the state polls that are released on a given day include, combined, thousands of interviews. There is just no reason to focus on what 250 or 300 people say when you can look at what 2,500 or 3,000 do instead.
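The size of that margin of error follows from the standard formula for a sampled proportion. The sketch below assumes a simple random sample, a 95 percent confidence level and a candidate near 50 percent support. Real polls are weighted, which typically widens the margin somewhat, and treating a day's state polls as one pooled sample is a simplification made here only for comparison.

```python
import math


def margin_of_error(n, p=0.5, z=1.96):
    """Approximate 95% margin of error, in percentage points, for a
    simple random sample of n respondents with support share p."""
    return 100 * z * math.sqrt(p * (1 - p) / n)


# Subsample sizes and full-sample sizes mentioned in the text.
for n in (250, 300, 1000, 2500, 3000):
    print(f"n = {n:>5}: +/- {margin_of_error(n):.1f} points")
```

Under these assumptions, 250 to 300 interviews give a margin of roughly plus or minus six points, while 2,500 to 3,000 interviews bring it down to about two.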
Heading into the second presidential debate, the FiveThirtyEight forecast still showed Mr. Obama as a modest favorite, with about a 2-in-3 chance of winning the election and a lead of just over one percentage point in the popular vote.
But historically, the second presidential debate has moved the numbers by about 2.5 percentage points in one direction or another.
If that shift were in Mr. Obama’s favor, he would re-establish enough of a lead that there would be little doubt about who was ahead.
Another shift toward Mr. Romney, however, and he would probably lead in most national polls and in enough swing-state polls to show him on a path to 270 electoral votes.