Allocating the Undecideds

Heretofore, I’ve simply been allocating undecided voters 50:50. That is certainly the most neutral assumption to make. But this site isn’t about making the most neutral assumption; it’s about making the most predictive one.

So what I’m going to do instead is tie the undecided allocation to the extent to which Barack Obama overperformed or underperformed his polls in particular types of states in the Democratic primaries. If you compare the actual results in the primaries to the final RCP or other polling averages, you’ll notice some fairly systematic differences.

Specifically, Obama overperformed:

1. In states with high African-American populations;
2. In states that share a border with Illinois (no, Kentucky doesn’t count);
3. In states with highly educated electorates;
4. To a lesser extent, in the South (as indicated by the number of evangelicals), even after accounting for the above variables.

Meanwhile, he underperformed his polls:

1. In the Appalachian states (as indicated by the number of respondents who identify their ancestry as ‘American’, a practice concentrated in the Appalachian region);
2. In states with low education levels;
3. And in states with a high number of Catholics.

This can all be ferreted out via regression analysis, taking the factors I describe above as the independent variables, and Obama’s performance vis-à-vis his polls as the dependent variable. The R-squared on this regression is .72, which is quite high: these factors account for roughly 72 percent of the variation in how far the polls missed, which means that it was rather predictable when the polls were wrong, and in which direction.
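For readers who want to see the mechanics, here is a minimal sketch of that kind of regression in plain Python. It is not the actual model: the two demographic columns and the polling-error figures below are made-up placeholders, and the helper names `solve_linear_system` and `fit_ols` are hypothetical. It only shows how a least-squares fit and its R-squared are computed.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    aug = [list(row) + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        # Swap in the row with the largest entry in this column for stability.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [x - factor * p for x, p in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]


def fit_ols(rows, y):
    """Least-squares fit with an intercept. Returns (coefficients, r_squared)."""
    design = [[1.0] + list(row) for row in rows]
    k = len(design[0])
    # Normal equations: (X'X) beta = X'y
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * t for d, t in zip(design, y)) for i in range(k)]
    coef = solve_linear_system(xtx, xty)
    fitted = [sum(c * v for c, v in zip(coef, d)) for d in design]
    mean_y = sum(y) / len(y)
    ss_res = sum((t - f) ** 2 for t, f in zip(y, fitted))
    ss_tot = sum((t - mean_y) ** 2 for t in y)
    return coef, 1.0 - ss_res / ss_tot


# Hypothetical inputs, one row per state: (black share, college share).
# These numbers are invented for illustration, not taken from the real data.
state_features = [(0.30, 0.20), (0.05, 0.35), (0.15, 0.25),
                  (0.02, 0.18), (0.25, 0.30), (0.10, 0.22)]
# Hypothetical dependent variable: Obama's actual margin minus his polled margin.
poll_error = [6.0, 2.0, 3.5, -4.0, 7.0, 0.5]

coefficients, r_squared = fit_ols(state_features, poll_error)
print("intercept and slopes:", coefficients)
print("R-squared:", r_squared)
```

An R-squared near 1 would mean the demographic factors explain nearly all of the polling error; a value near 0 would mean they explain almost none of it.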

To get a little ahead of myself: does this mean that there was in fact a Bradley Effect during the primaries? It’s not clear. What is actually quite clear — and I’m going to present some research on this over the next several days — is that the polls did a rather poor job of accounting for the black vote. Not only did essentially every “undecided” African-American voter wind up voting for Obama, but some of those who told pollsters they were going to vote for Hillary also wound up voting for Obama. The reverse Bradley Effect, in other words, was fairly manifest.

It’s also clear that there were some patterns in the way that undecided white voters behaved. For one thing, a majority of them, probably somewhere between 60 and 65 percent, wound up voting for Clinton. This is perhaps not so remarkable, considering that about 60 percent of white voters in the primaries voted for Clinton, period. But this figure was higher in regions like Appalachia, and among groups like Catholics, and lower in places where you had a lot of WASPy, educated voters. Whether you should label this a Bradley Effect, I don’t know, but the behavior of undecided voters has been predictable to a certain extent.

Now, it does not necessarily follow that the patterns exhibited by undecided voters in the primaries will match those in the general election. But based both on my research and on what I’ve been hearing from people on the ground, it’s apparent that the public polling in general is not terrific, and that if we have an instinct about where the polls are more likely to come in high or low, we probably ought to follow it.

So what I’ve done is to transform the results of the regression analysis that I described above into an undecided voter allocation for each state. The allocation is “rigged” such that neither candidate will gain or lose ground in the national popular vote as a result, and such that the range of allocations runs from about .35 to .65. That is, in some states we’ll allocate as much as 65% of the undecided vote to John McCain (and just 35% to Barack Obama) and in others we’ll allocate as much as 65% to Obama (and just 35% to McCain).
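One way such a transformation could work is sketched below. This is an assumption on my part about the mechanics, not the exact procedure: center each state’s predicted polling error on the undecided-weighted national average (so the weighted average allocation stays at exactly 50%), then rescale so the most extreme state sits 15 points from 50:50. The function name `allocate_undecideds` and the three-state inputs are hypothetical.

```python
def allocate_undecideds(predicted_error, undecided_votes, max_dev=0.15):
    """Map predicted polling errors to Obama's share of each state's undecideds.

    predicted_error: state -> regression prediction (positive favors Obama)
    undecided_votes: state -> estimated number of undecided voters
    max_dev: largest allowed distance from a 50:50 split
    """
    total = sum(undecided_votes.values())
    # Weighted mean of the predictions; subtracting it keeps the national
    # popular vote unchanged relative to a flat 50:50 split.
    center = sum(predicted_error[s] * undecided_votes[s]
                 for s in predicted_error) / total
    centered = {s: predicted_error[s] - center for s in predicted_error}
    widest = max(abs(v) for v in centered.values())
    scale = max_dev / widest if widest else 0.0
    return {s: 0.5 + scale * v for s, v in centered.items()}


# Invented placeholder states and numbers, for illustration only.
predicted = {"State A": 4.0, "State B": -2.0, "State C": 0.0}
undecided = {"State A": 100_000, "State B": 300_000, "State C": 100_000}

allocation = allocate_undecideds(predicted, undecided)
for state, share in allocation.items():
    print(f"{state}: {share:.1%} of undecideds to Obama")
```

Because the rescaling is linear and applied after centering, no clipping is needed, and the undecided-weighted average of the allocations remains exactly one half.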

The specific allocations follow. Remember, these are based on the extent to which Obama over- or underperformed his polls in various states during the primaries:

Percent of Undecided Votes Allocated to Barack Obama

DC 64.4%
MS 64.4%
GA 63.0%
MD 61.5%
SC 61.1%
AL 60.9%
NC 58.0%
VA 57.8%
IN 57.8%
IA 56.9%
AR 56.8%
OK 56.5%
WI 56.5%
DE 53.7%
AK 53.4%
WA 52.7%
FL 52.4%
TN 52.3%
CO 51.8%
MO 51.6%
MI 51.5%
KS 51.4%
OR 51.0%
LA 50.7%
UT 50.6%
HI 50.5%
MN 50.2%
NE 49.8%
TX 48.3%
IL 48.3%
MT 48.0%
OH 47.2%
NV 46.7%
WY 46.6%
SD 46.4%
AZ 46.0%
ND 45.5%
ID 45.4%
NJ 45.0%
PA 44.8%
CT 44.6%
NY 44.6%
VT 43.7%
KY 43.4%
CA 42.9%
ME 42.6%
NH 42.2%
MA 41.0%
NM 40.0%
WV 38.6%
RI 35.0%

At this point in the election, the number of undecideds is fairly low: generally between 4 and 6 points in each state, once we’ve finished assigning a point or two to third-party candidates. As such, these allocations do not make a great deal of difference: at the most, a swing of maybe a point or a point-and-a-half in the margin.
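The arithmetic behind that estimate is simple enough to sketch. If a state has U points of undecided voters and Obama gets a share a of them, his margin changes by U × (2a − 1) relative to an even split; with 5 undecided points and a 65% allocation, that works out to 1.5 points. The function name `margin_shift` is hypothetical, and the undecided totals passed in below are illustrative rather than actual polling figures.

```python
def margin_shift(undecided_points, obama_share):
    """Change in Obama's margin (in points) versus splitting undecideds 50:50.

    Obama gains undecided_points * obama_share and McCain gains the rest,
    so the margin moves by undecided_points * (2 * obama_share - 1).
    """
    return undecided_points * (2.0 * obama_share - 1.0)


# Allocation shares come from the table above; the undecided totals are
# assumed values within the 4-to-6-point range described in the text.
for label, points, share in [("most favorable to Obama", 5.0, 0.644),
                             ("West Virginia", 6.0, 0.386),
                             ("Virginia", 4.0, 0.578)]:
    print(f"{label}: {margin_shift(points, share):+.2f} points")
```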

Still, you can see some impacts at the margins. Take a state like West Virginia, where the polling has been reasonably close but where there are also high numbers of undecided voters. Those undecideds aren’t the type who are liable to side with Barack Obama when pushed to a decision, and so the state is not quite as promising for him as it looks on paper. There is also a fairly high number of undecideds in Ohio, a state where we think the undecided vote is liable to break slightly for John McCain. On the other hand, a state like Virginia, where Obama overperformed his polls during the primaries and where some polling has shown a relatively generous (and probably false) number of African-American votes going to John McCain, might be just a smidgen stronger for Obama than it appears.

Nate Silver is the founder and editor in chief of FiveThirtyEight.