Some of you may want to bypass this post, but in the interests of full disclosure:
I’ve made a couple of improvements to the regression model that underlies the analysis. The first adjustment is to weight the regression based on the depth of polling data that we have in a given state. Without this weighting, the regression would treat a state like Wyoming, where we have just one poll, as having as much influence over the model as a state like Pennsylvania, where we are already approaching a dozen. Among other things, this should allow the model to “read-and-react” more quickly to new polling data.
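For readers who want to see the idea in code, here is a minimal sketch of a poll-weighted regression. It assumes each state's weight is simply its number of polls and uses a single predictor for brevity. The post does not specify the exact weighting function, and the real model has several predictors. The function name and all of the numbers are hypothetical.

```python
def weighted_linear_fit(x, y, weights):
    """Fit y = intercept + slope * x by weighted least squares (one predictor)."""
    total = sum(weights)
    # Weighted means: states with more polls pull the averages toward themselves.
    mean_x = sum(w * xi for w, xi in zip(weights, x)) / total
    mean_y = sum(w * yi for w, yi in zip(weights, y)) / total
    sxy = sum(w * (xi - mean_x) * (yi - mean_y) for w, xi, yi in zip(weights, x, y))
    sxx = sum(w * (xi - mean_x) ** 2 for w, xi in zip(weights, x))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Made-up inputs for three unnamed states (not real polling data).
kerry_margin = [-40.0, 2.5, -9.0]   # predictor: 2004 Kerry-Bush margin
poll_margin = [-30.0, 4.0, -1.0]    # response: current polling margin
poll_count = [1, 12, 4]             # assumed weight: number of polls per state

slope, intercept = weighted_linear_fit(kerry_margin, poll_margin, poll_count)
print(f"slope={slope:.3f}, intercept={intercept:.3f}")
```

With equal weights, the one-poll state would count as much as the twelve-poll state. With these weights, the fitted line is pulled toward the states we know the most about.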
The second improvement is to consider a couple of new variables in the analysis: the percentages of the 2004 electorate that identified themselves as Democrat, Independent, and Republican, according to CNN exit poll data. Obama does comparatively worse in states where a larger share of John Kerry’s vote came from self-reported Democrats, and better where more of it came from Republicans and Independents. This is consistent with a finding from the recent Pew Poll, which shows Obama losing more self-identified Democrats to McCain than Clinton does, but getting a larger fraction of the vote from Republicans and Independents. The new variables tend to give the model more confidence in Obama’s polling lead in a state like New Hampshire, which has a huge number of self-reported Independents (44% of the electorate), while harming him in a state like West Virginia, where just 18% of the electorate identify as Independent (but 50% identify as Democrat).
The overall effect of these adjustments is to slightly hurt Obama’s win percentage, as he loses a few percentage points in industrial states like Pennsylvania that have relatively few independents (20% Independent, 41% Democrat in 2004). Clinton’s numbers have moved up a tiny bit.
To get even more technical, the regression model is programmed to consider eight potential variables:
- The Kerry-Bush margin in 2004.
- The percentage of Baptists (all Southern Baptists, plus 1/2 of non-Southern Baptists)
- Obama fundraising (dollars raised per 2004 general election voter)
- Clinton fundraising (dollars raised per 2004 general election voter)
- McCain fundraising (dollars raised per 2004 general election voter)
- The percentage of African-Americans in the population
- The percentage of self-identified Democrats
- The percentage of self-identified independents
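Two of the variables above are derived rather than read straight from a table. A minimal sketch of how they might be computed follows. The function names and the example inputs are hypothetical, and only the formulas come from the list.

```python
def baptist_share(southern_baptist_pct, other_baptist_pct):
    """All Southern Baptists, plus half of non-Southern Baptists."""
    return southern_baptist_pct + 0.5 * other_baptist_pct


def dollars_per_voter(dollars_raised, voters_2004):
    """Fundraising scaled by the state's 2004 general election turnout."""
    return dollars_raised / voters_2004


# Made-up inputs, purely for illustration.
print(baptist_share(10.0, 4.0))                 # 10 + 0.5 * 4
print(dollars_per_voter(1_500_000, 3_000_000))  # dollars per 2004 voter
```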
The model discards any variables that are not statistically significant at the 85% confidence level and retains the rest. The variables presently included in the regression model are as follows:
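| Variable | Obama Coeff. | Obama t-score | Clinton Coeff. | Clinton t-score |
|---|---|---|---|---|
| Kerry | .549 | 6.20 | .714 | 13.65 |
| Baptist | -.261 | -2.74 | .381 | 4.00 |
| $_Obama | 6.708 | 3.60 | DROPPED | |
| $_Clinton | DROPPED | | 4.627 | 3.82 |
| $_McCain | -9.421 | -2.92 | -6.236 | -2.34 |
| Af-American | DROPPED | | -.173 | -1.56 |
| Dem% | -.560 | -3.09 | DROPPED | |
| Ind% | DROPPED | | DROPPED | |
| Constant | 24.561 | 3.78 | -3.282 | -2.77 |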
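The retain-or-drop rule can be sketched as a simple filter on t-scores. This is an approximation under stated assumptions, not the model's actual code. It assumes a two-sided test and uses the normal distribution in place of the t-distribution, which puts the 85% cutoff at roughly 1.44. It also makes a single pass, whereas a real stepwise procedure would refit after each drop. The function names are hypothetical, and the `Ind%` t-score is invented for illustration because the table does not report t-scores for dropped variables.

```python
from statistics import NormalDist


def critical_value(confidence=0.85):
    """Two-sided cutoff under a normal approximation (assumption)."""
    return NormalDist().inv_cdf(0.5 + confidence / 2)


def retain_significant(t_scores, confidence=0.85):
    """Keep only the variables whose |t| meets the cutoff."""
    cutoff = critical_value(confidence)
    return {name: t for name, t in t_scores.items() if abs(t) >= cutoff}


# Clinton-column t-scores from the table, plus one HYPOTHETICAL value for Ind%.
clinton_t = {
    "Kerry": 13.65,
    "Baptist": 4.00,
    "$_Clinton": 3.82,
    "$_McCain": -2.34,
    "Af-American": -1.56,
    "Ind%": 0.80,  # hypothetical: not reported in the table
}

print(sorted(retain_significant(clinton_t)))
```

Under this approximation the African-American variable, at a t-score of -1.56, clears the cutoff only narrowly, which matches its place in the Clinton column as the weakest variable still retained.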