OK, so in the few hours since I published this earlier post on the bizarre South Carolina Senate Democratic primary, things have been developing rapidly, and on a variety of fronts. I’ve received several emails from people I trust and respect, and the Vic Rawl campaign has now issued a statement about a study conducted by a political scientist and statistician I very much respect. For lack of a better way to unpack all of this, I’ll just bullet point what I’ve learned or read, and then try to pull it all together in a summary at the end.

1. Non-Democratic irregularities. OK, first, a very quick development which my colleague Michael McDonald of George Mason, who is one of the leading experts on American voter turnout, brought to my attention. Dr. McDonald–with whom, for full disclosure, I recently co-wrote a forthcoming book chapter about voter turnout and mobilization during the 2008 presidential race–emailed to point out that in South Carolina the weird stuff is not limited to the Democratic side of the aisle because there are “three counties with more votes cast in Republican governor’s race than reported turnout in the Republican primary.” He said there may be more but the GOP gubernatorial primary “is the only race I’ve looked at so far other than the Democratic Senate race.” Those three are Darlington, Horry and Marlboro, and there are two others, Bamberg and Fairfield, with zero residual GOP votes (i.e., the total number of GOP voters in the county is identical to the number of votes cast in the GOP gubernatorial primary), which McDonald informs me is very, very rare.
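For readers who want to run this kind of check themselves, here is a rough sketch of the residual-vote comparison McDonald describes. The function name, the labels and the county figures are all my own placeholders (made-up numbers, not the actual South Carolina returns), so treat it as an illustration of the arithmetic rather than a reproduction of his work.

```python
def flag_counties(rows):
    """Flag counties whose residual vote is negative or exactly zero.

    rows: iterable of (county, reported_turnout, votes_in_race).
    Residual = reported turnout minus votes recorded in the race.
    Normally this is a small positive number, because some voters skip a race.
    """
    flags = {}
    for county, turnout, votes in rows:
        residual = turnout - votes
        if residual < 0:
            flags[county] = "more votes than voters"
        elif residual == 0:
            flags[county] = "zero residual"
    return flags


# Hypothetical placeholder data, NOT real county returns.
example = [
    ("County A", 5000, 5120),  # more votes in the race than reported turnout
    ("County B", 3200, 3200),  # every single voter voted in this race
    ("County C", 8100, 7950),  # ordinary case: 150 voters skipped the race
]

print(flag_counties(example))
```

Only the first two placeholder counties get flagged; the third, with a modest positive residual, is what a normal county would be expected to look like.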

2. Some (albeit very weak) racial pattern to Greene’s vote share? “Jeffmd” at Swing State Project, which is run by DavidNYC, a blogger whose identity I know and whose site and bloggers are reliable, has a very detailed post in which he finds, at the precinct level, a stronger relationship between race and Greene’s vote share. This analysis is similar to the two graphs produced by a very smart commenter on the Monkey Cage post by John Sides I linked in my previous post. The main point of both of these analyses is that the effectively zero relationship between race and Greene’s performance that Sides and I independently found at the county unit of analysis becomes somewhat stronger (and specifically positive) at the precinct unit of analysis–but is still very weak. So, there may be a very slight racial component to the voting pattern, which brings us next to….

3. The Green(e) = Black theory. Jeffmd’s post also raises the possibility of something fellow 538er Ed Kilgore mentioned to me, and which Walter Ludwig, Rawl’s campaign manager, had also heard discussed by some in South Carolina: That the specific spelling of Greene, with the “e” on the end, is recognized by many to be a surname spelling common in the African American community. I’m not sure if that’s universally true, although the only other person I know who has “Greene” as a surname is also black; and of course the absence of an “e” at the end does not necessarily imply white or even non-black: Texas has two Congressmen named Green, neither with the “e” on the end, and Gene Green is white but Al Green is black. Whatever the case, one can understand how this could serve as a weak clue imparting some information about an otherwise unknown candidate, just as there are certain patterns in Jewish, Hispanic, Italian, eastern European and many other surnames. It also squares the circle of the possible paradox of there being some racial pattern despite the vast majority of voters knowing nothing or next to nothing about either Greene or Rawl.

4. Revenge of Benford’s Law. As regular 538 readers know, our otherwise soft-spoken leader Nate Silver carries a big statistical stick and used it earlier this year to cudgel the polling firm Strategic Vision by showing that their results had unusual digit patterns which strongly suggest the results were simply made up. Similarly, the Rawl campaign has now issued a press release reporting the findings of separate inquiries performed by two respected electoral forensics experts, Dr. Walter Mebane of the University of Michigan and Dr. Michael Miller of Cornell, neither of whom is affiliated with the campaign. Here are the key graphs of that press release:

Dr. Mebane performed second-digit Benford’s law tests on the precinct returns from the Senate race. The test compares the second digit of actual precinct vote totals to a known numeric distribution of data that results from election returns collected under normal conditions. If votes are added or subtracted from a candidate’s total, possibly due to error or fraud, Mebane’s test will detect a deviation from this distribution.

Results from Mebane’s test showed that Rawl’s Election Day vote totals depart from the expected distribution at 90% confidence. In other words, the observed vote pattern for Rawl could be expected to occur only about 10% of the time by chance. “The results may reflect corrupted vote counts, but they may also reflect the way turnout in the election covaried with the geographic distribution of the candidates’ support,” Mebane said.

Dr. Miller performed additional tests to determine whether there was a significant difference in the percentage of absentee and Election Day votes that each candidate received. The result in the Senate election is highly statistically significant: Rawl performs 11 percentage points better among absentee voters than he does among Election Day voters. “This difference is a clear contrast to the other races. Statistically speaking, the only other Democratic candidate who performed differently among the two voter groups was Robert Ford, who did better on Election Day than among absentees in the gubernatorial primary,” Miller said.

These findings concern the campaign, and should concern all of South Carolina.

Indeed: An unusual, non-random pattern in the precinct-level results suggests tampering, or at least machine malfunction, perhaps at the highest level. And Mebane is perhaps the leading expert on this very subject. Along with the anomalies Miller found between absentee and Election Day ballots, something smells here. (You can find out more about Mebane here, and this is a helpful paper of his about second-digit Benford’s Law, or “2BL,” voting irregularity patterns.)
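To make the mechanics a little more concrete, here is a minimal sketch of a second-digit Benford test of the general kind the press release describes. To be clear, this is not Mebane's code and I don't know the exact specification he used; I'm assuming a plain chi-square comparison of observed second digits against the Benford second-digit frequencies, and the function names and sample counts are my own placeholders.

```python
import math


def benford_second_digit_probs():
    """Expected frequency of each second digit 0-9 under Benford's Law.

    P(d) = sum over first digits k = 1..9 of log10(1 + 1 / (10k + d)).
    """
    return [
        sum(math.log10(1 + 1 / (10 * k + d)) for k in range(1, 10))
        for d in range(10)
    ]


def second_digit_chi_square(vote_counts):
    """Chi-square statistic comparing observed second digits to Benford.

    Counts below 10 have no second digit and are skipped.
    Returns (statistic, number_of_counts_used).
    """
    digits = [int(str(c)[1]) for c in vote_counts if c >= 10]
    n = len(digits)
    if n == 0:
        raise ValueError("need at least one count with two or more digits")
    observed = [digits.count(d) for d in range(10)]
    expected = [n * p for p in benford_second_digit_probs()]
    stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return stat, n


# Hypothetical precinct totals, NOT the actual South Carolina returns.
sample = [105, 23, 7, 340, 58, 212, 91, 46, 130, 77]
stat, n = second_digit_chi_square(sample)
# With 10 digit categories there are 9 degrees of freedom; the chi-square
# critical value is roughly 14.68 at the 10% level and 16.92 at the 5% level.
print(f"chi-square = {stat:.2f} on {n} precinct counts")
```

A real test would use every precinct in the state, not ten invented numbers. With so few counts the statistic means very little, which is one reason a result at the 90 percent level deserves the caveat Mebane himself attaches to it.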

5. The Republican crossover theory debunked. In addition to many smart comments from 538 readers to the previous post on the SC race, I received an email from one particularly astute reader named Harrison Brown. Complete with an excel spreadsheet to back up his conclusions, Brown basically argues that there’s neither any logic to, nor statistical evidence to support, the idea of Republicans crossing over to infiltrate the Democratic primary. Here are the key sections from his email to me, verbatim:

1. Suppose people were being brought into the Democratic-primary voting pool (from unregistered voters, the Republican faithful, or wherever) for the sole purpose of voting for Greene. Imagine a variable encapsulating the proportion of primary voters in each county who are Greene partisans; this (hidden) variable ought to be strongly positively correlated with both Greene’s final results and with the participation rate in each county. In particular, this implies that Greene’s vote share and the participation rate, both of which we can measure, would be correlated. But this is not the case — under either linear or rank correlation! The R-squared and rho-squared are both effectively 0.

2. Even if that effect didn’t show up, there should still be other signs. For instance, we can see if there are any counties where turnout for the Democratic primary exceeded the number of votes Barack Obama received in 2008; those would be prime suspects for Republican influence. And, in fact, there are three such counties: Hampton, Lee, and Union. But these are all fairly small counties where McCain/Palin received under 30% of the vote — hardly Republican-dominated…

A more robust analysis of turnout levels reveals similar patterns. Although I didn’t collect data for Republican voters (except for the McCain vote share), I came up with a rough estimate of GOP voters in 2008 by assuming the two-party share was 100% in each county. Running a linear regression to predict the number of Democratic primary voters from the number of votes Obama and McCain received, we find that the McCain raw vote total is statistically significant–but it has a negative coefficient. If anything, this points to voter suppression (no real surprises) rather than ballot box stuffing.

3. Finally, there’s the simple question of where the Republican voters would have come from! From eyeballing the GOP primary totals, it seems like turnout in that elections was almost ludicrously high, which seems more-or-less corroborated by what Google’s told me. But barring widespread voter fraud and/or corruption by local election officials, high turnout in the GOP primary should be incompatible with infiltration into the Democratic primary.

In conclusion, while the voting patterns in the D-Senate primary are strange and may not be totally legitimate, they don’t bear the expected hallmarks that would arise in the case of a Republican plant.
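Brown's first test is easy to sketch for anyone who wants to try it on the county returns. The following is my own rough illustration, not his spreadsheet: it computes the squared linear (Pearson) and rank (Spearman) correlations between two columns, and the helper names and county figures are placeholders I made up.

```python
import math


def pearson_r(x, y):
    """Linear correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def average_ranks(values):
    """Rank values from 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Rank correlation: Pearson correlation computed on the ranks."""
    return pearson_r(average_ranks(x), average_ranks(y))


# Hypothetical county-level figures, NOT the actual South Carolina data.
greene_share = [0.58, 0.61, 0.55, 0.63, 0.59, 0.60]
participation = [0.12, 0.09, 0.15, 0.11, 0.08, 0.14]

print("R-squared:  ", round(pearson_r(greene_share, participation) ** 2, 3))
print("rho-squared:", round(spearman_rho(greene_share, participation) ** 2, 3))
```

If an influx of Greene-only voters were driving both turnout and his vote share, both squared correlations should be well above zero; Brown reports that on the real data they are effectively 0.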

With all that now added to the record, so to speak, how does the matter now stand?

Well, I think it’s safe to say that the third possibility I raised in the previous post–GOP cross-primary infiltration–can be eliminated. There doesn’t seem to be any direct or circumstantial evidence for that, and there were sufficient motives to participate in the very contentious GOP gubernatorial primary (especially with Nikki Haley running). So we can almost certainly eliminate the idea that there was a coordinated GOP effort to get Republican and/or conservative voters to pick up Democratic ballots with the intent of selecting Greene as DeMint’s general election opponent.

That leaves what I think are now two scenarios:

A. The first is a combination of the first and second possibilities of my initial post: Greene was a nobody, but Rawl was darn near close to a nobody, and thus Greene’s alphabetical ballot position, coupled with whatever signal the spelling of his surname sent to some African Americans that he might be (and in fact is) an African American, with a dash of Rawl’s high disapproval among the 18 percent of survey respondents who had heard of him, combined to take what in theory might otherwise have been a 50/50 split among two broadly unknown candidates and turned it instead into a 59/41 race.

B. Somebody with access to software and machines engineered a very devious manipulation of the vote returns–but not so devious that he/she/they were able to cover their tracks in the digit patterns of those results.

UPDATE: The second commenter to this post, a variety of commenters to the previous post and several analysts have all posed this question about vote-tinkering: Why would the GOP or DeMint or conservatives bother to do so in this race? The assumption is that DeMint will cruise. And he probably will. But given that he was expected to run against a virtual unknown in Rawl, DeMint’s head-to-head numbers were pretty dismal in this (presumably internal) poll, and he was not entirely out of reach even in this PPP poll taken a week or so before the primary. So I’m not entirely sure DeMint, though very safe, was a lock to win re-election.

