A Brief History of Primary Polling, Part I

I’m going to be doing a short series, probably in three parts, on the question of how much we can tell from polls conducted during the very early stages of a presidential primary campaign.

The thesis is that contrary to what you may have read elsewhere, national polls of primary voters — even this far out from the Iowa caucuses and New Hampshire primary — do have a reasonable amount of predictive power in informing us as to the identity of the eventual nominee. That doesn’t mean that these polls are the only thing you should look at, or even necessarily the first thing, but they are a perfectly valid way to do some initial handicapping.

Another part of the thesis is that the polls can become even more useful if we also account for one other quality, which is name recognition.

In the first two pieces, I’m simply going to look at what the polls said about the respective fields for each competitive primary campaign going back to 1972, which is generally taken to be the beginning of the modern primary era (before about 1972, many states did not hold primaries at all, or they were beauty contests). Today, we’ll look at past Republican fields, and then turn to Democrats in the next article.

Specifically, I’m going to consider what the polls said at a comparable point in time to the one we find ourselves in now — early in the year before the primaries began. So, for instance, to evaluate the contenders for 1980, we’d look at what the polls said in the first six months (January through June) of 1979. The polls were gathered by Micah Cohen and me from a number of resources, primarily Lexis-Nexis for the earlier years.

Since Richard Nixon faced only token opposition upon being re-nominated in 1972, our journey for Republicans begins in 1976. This is what things looked like in early 1975, the year before that primary was held.

A few technical points about this chart. First, you’ll see some color coding. The yellow highlight indicates the name of the eventual nominee. Candidates whose names appear in blue declined to run for the presidency, even though they appeared in some polls.

As you work your way from left to right in the data table, you’ll first see the candidate’s name, followed by his average standing in each of the polls we were able to track down. Just to the right of that, you’ll see two numbers in parentheses — for example, (2/3). These indicate, respectively, the number of polls the candidate was included in, and the total number of polls for that year. So Barry Goldwater, for instance, was included in 2 of the 3 polls that we identified for 1976.

If the candidate’s name was not included in the poll, we treat this as a zero rather than a “blank” — in other words, he is penalized for this. There are a couple of reasons for doing things this way. First, when there’s uncertainty about whether or not a candidate is going to run, this is a nice way to let the “market” come to a judgement about that — some pollsters will include him while others won’t. Second, this approach produces notably better predictions on the historical data set.
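To make the zero-versus-blank distinction concrete, here is a minimal sketch in Python. The function name `poll_average` and the sample numbers are hypothetical, made up for illustration rather than taken from our data set; the only thing borrowed from the method above is the rule that a poll which omits a candidate counts as a zero.

```python
from typing import Optional, Sequence, Tuple


def poll_average(results: Sequence[Optional[float]]) -> Tuple[float, str]:
    """Average a candidate's standing across every poll found for a cycle.

    `results` holds one entry per poll; None means the pollster did not
    include the candidate, which is counted as 0 rather than skipped.
    Returns the average and a coverage label such as "(2/3)".
    """
    total_polls = len(results)
    if total_polls == 0:
        raise ValueError("need at least one poll for the cycle")
    included = sum(1 for r in results if r is not None)
    # Omitted polls contribute 0 to the numerator but still count in the denominator.
    total_support = sum(r if r is not None else 0.0 for r in results)
    return total_support / total_polls, f"({included}/{total_polls})"


# Hypothetical candidate: 9 and 6 percent in two polls, left out of a third.
average, coverage = poll_average([9.0, None, 6.0])
print(average, coverage)
```

Under the "blank" approach, this hypothetical candidate would average 7.5 percent across the two polls that named him; counting the omission as a zero pulls him down to 5 percent.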

Next, you’ll see a column for “name recognition.” This is simply an estimate of the percentage of primary voters who would have heard of the candidate’s name at this stage of the election.

The best way to ask this question is probably in the way that Gallup does:

“I am going to mention the names of some people in the news. For each one, please tell me if you recognize the name, or not.”

Pollsters should be asking name recognition questions like this one more often than they do. A lot of polls ask for favorability ratings for the candidates, and allow people to “opt out” of the question if they haven’t formulated an opinion of them, but that’s putting the cart before the horse. People may be familiar with a candidate but have ambivalent feelings toward him, or they may feel pressure to provide some sort of response even if they don’t know him from Adam. The better way to do things, as Gallup often does, is to ask about name recognition first, and then ask about favorability conditional upon that question.

With that said, we were able to find some name recognition data, most often from Gallup, for perhaps 80 or 90 percent of the candidates. For the others, I made an educated guess based on factors like whether the candidate had run for the presidency before and the types of offices that he’d held. For instance, an otherwise undistinguished senator or governor will usually start with name recognition of about 30 percent once he begins to make some noise about running for president and gets some early media attention, so that figure would be applied for this type of candidate when we lacked more specific data.

There certainly is some imprecision in my estimates because of factors like the different wording pollsters use to get at the name recognition question, as well as the handful of cases in which there was no hard data at all. In most cases, though, they ought to be sound estimates, considerably better than rough guesses. If you have some evidence that strongly contradicts our estimate for a particular candidate, please feel free to note that in an e-mail or in the comments section.

The final column is the Recognition Adjusted Poll Average — I suppose you could use the acronym RAPA, but it’s not terribly catchy — which is simply the candidate’s polling average divided by his name recognition. In other words, it measures the percentage of those people who were familiar with a candidate who had him as their first choice. Although this figure tends not to be terribly interesting for the Republicans, you’ll see some cases once we get to the Democrats where it turns out to be quite informative.
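As a rough sketch of that arithmetic, assuming both inputs are expressed as percentages from 0 to 100 (the function name and the rescaling back to a percentage are my own choices for illustration, not something the charts prescribe):

```python
def recognition_adjusted_average(poll_average: float, name_recognition: float) -> float:
    """Polling average divided by name recognition, returned as a percentage.

    Both arguments are assumed to be percentages (0-100). The result is the
    share of people familiar with the candidate who named him first choice.
    """
    if not 0 < name_recognition <= 100:
        raise ValueError("name recognition must be in (0, 100]")
    return 100.0 * poll_average / name_recognition


# Hypothetical candidate polling at 6 percent, with the default 30 percent
# name recognition used for an otherwise undistinguished senator or governor.
print(recognition_adjusted_average(6.0, 30.0))
```

In this made-up case, a candidate who looks like an also-ran at 6 percent turns out to be the first choice of one in five voters who have actually heard of him.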

Getting back to 1976, we see that there was a wide array of Republicans (everyone from Barry Goldwater to Nelson Rockefeller) who were mentioned as potential successors to Gerald Ford, who was not certain to run for a term of his own after taking over following Richard Nixon’s resignation. Once Mr. Ford decided to run, only one candidate, Ronald Reagan, challenged him.

Mr. Reagan, who had about 20 percent support in early polls as compared to Mr. Ford’s 38 percent, came very close to winning his challenge, but ultimately lost it on the floor of the Republican convention in Kansas City.

Mr. Reagan had a leg up on the 1980 nomination, however, which he won fairly easily:

The most serious threat to Mr. Reagan was again probably Mr. Ford, but Mr. Reagan led him in early polls, and Mr. Ford opted not to run. Mr. Reagan’s most vigorous challenge eventually came from George H.W. Bush, who had made little impression on voters early on but won the slot as Mr. Reagan’s vice presidential nominee for his efforts.

Mr. Reagan was essentially unopposed in 1984, so we’ll skip ahead to 1988.

Mr. Bush led in the early polls, although he was no shoo-in, with Bob Dole in particular looking like a serious challenger. Mr. Bush eventually prevailed, however, despite losing to both Mr. Dole and Pat Robertson in Iowa.

The next cycle, 1992, was an unusual one. For a variety of reasons, including the Gulf War and a late primary calendar, the presidential field was very slow to form on both sides; for instance, Bill Clinton did not officially declare for the presidency until October 1991. In fact, we could not find any polls for Republicans in the first six months of 1991. So for this year and this year alone, the polls reflect everything in the field from July through December of the year before the primary, rather than January through June.

By late 1991, Mr. Bush’s popularity had waned enough that he received a primary challenge from Pat Buchanan. The polls suggested that Mr. Buchanan wasn’t much of a threat, and he turned out not to be, although he came somewhat closer than expected in New Hampshire, getting 37 percent of the vote there.

The 1996 Republican field was much broader, but Bob Dole had a very substantial early lead in the polls and won the nomination easily, losing only 6 states. The candidate who might have been the most challenging to him, Texas Senator Phil Gramm — who got a decent amount of support in the polls despite middling name recognition — turned out to be a poor retail campaigner.

The next cycle, 2000, also featured a clear frontrunner in the person of George W. Bush, who had a huge early lead in polls despite being part of a reasonably deep field. Mr. Bush fended off a late surge from John McCain and won 43 states.

After Mr. Bush won re-nomination without a fight in 2004, it was Mr. McCain’s turn in 2008. He had to come from behind, however, as he trailed Rudolph W. Giuliani in all but 2 of the 68 polls conducted in early 2007.

This is, in fact, the only time in the modern era that the Republican who led in the early polls failed to win the nomination, and even then Mr. McCain was running a reasonably strong second. Granted, some of these years, like 1992, were only nominally competitive, but overall that’s a pretty darned good track record, and not one consistent with the hypothesis that early polls are meaningless.

But nomination contests have been far more dynamic and far less predictable on the Democratic side, as we’ll see in the next installment.

FiveThirtyEight
