Nerdfight: UK Election Model Methodology

Warning: The following will only be of interest to about 0.01 percent of you. But if you’re really into model-building, psephology, or the UK election, or just want to see two nerds going at it, read on!

Over at Pollster.com, Robert Ford of the University of Manchester has a lengthy critique of our UK election forecasting model. I am not quite sure what triggered this, although Ford and his colleagues have developed their own model, one for PoliticsHome that depends on a variant of the uniform swing hypothesis, that I suppose we’re competing with.

Unfortunately, the critique is not very well thought out. It makes a number of factual errors and unproven assertions, and leaves one with as many questions about Ford’s methodology as about mine. I am going to dissect Ford’s critique (and counter-critique his model) on a paragraph-by-paragraph basis.

Empirically, there is little support for Nate Silver’s conception of proportional swing, as shown in this recent paper by my colleague David Voas.

If people actually click on that link and read the Voas paper, they will find that there is no less support for my hypothesis than for theirs. The paper points out that the large swing to the Tories under Margaret Thatcher in 1979 was modelled extremely well by uniform swing. But, the paper also points out, uniform swing would have been a very poor assumption in a more recent “change” election, that of 1997, when the improvement in the Labour vote was not linear but curved and would have been substantially better represented by a model like ours. The paper further points out that there were some systematic deviations from uniformity in the two other recent elections that it studies, 2001 and 2005. It is an extremely equivocal paper and simply does not endorse uniform swing over my approach to any degree whatsoever.

There is no evidence of larger swings in recent elections (including 1997) where parties start off more strongly. There is some evidence that swings are larger where the parties are competing more closely, but in our view Nate’s model is a poor way to capture this dynamic.

The first statement is clearly wrong. In 1997, Conservatives lost an average of 14 percent of their vote in the 46 constituencies in Great Britain in which they began with 60 percent or more of the vote, but 7 percent in the 57 constituencies in which they began with 25 percent or less.
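
To make the distinction concrete, here is a minimal sketch of the two swing rules applied to a party whose national vote falls. The national shares (42 percent falling to 31 percent) are illustrative round numbers picked for this example, not figures from the Voas paper or from either model, and the function names are hypothetical. Pure proportional swing is also a simplified stand-in for what our model actually does.

```python
def uniform_swing(seat_share, national_before, national_after):
    """Add the same number of points to every constituency."""
    return seat_share + (national_after - national_before)


def proportional_swing(seat_share, national_before, national_after):
    """Scale every constituency's share by the same national ratio."""
    return seat_share * (national_after / national_before)


# Illustrative national shares (assumed, not taken from the source).
NATIONAL_BEFORE, NATIONAL_AFTER = 42.0, 31.0

for start in (60.0, 25.0):
    u = uniform_swing(start, NATIONAL_BEFORE, NATIONAL_AFTER)
    p = proportional_swing(start, NATIONAL_BEFORE, NATIONAL_AFTER)
    print(f"start {start:.0f}: uniform loss {start - u:.1f}, "
          f"proportional loss {start - p:.1f}")
```

Under uniform swing the loss is the same 11 points in both seats. Under proportional swing the loss is larger in the strong seat than in the weak one, which is the direction of the 1997 pattern described above.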

[The PoliticsHome] model also incorporates systematic differences in swing suggested by the polling data. We anticipate stronger Conservative performance in the marginal seats where they are competing directly with Labour by allowing an extra 2 points of swing to them in such seats.

This is undoubtedly the weakest facet of their model: they give Conservatives 2 bonus points in cases where the Labour majority was between 6 and 14 points in 2005. Why 2 extra points of swing and not 1 point, or 3 points, or 5 points? Why are the goalposts for this adjustment set between a margin of 6 and 14 points, and not 8 and 12, or 4 and 16? Why the sharp edges (there is a substantial bonus given to Conservatives if Labour won by 6.001 percent of the vote in 2005, but none at all if they won by 5.999) and not the curvature that the world actually obeys? This adjustment appears to be completely arbitrary and undermines their pretense of being more firmly grounded in the empirical evidence.
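
The sharp-edge problem is easy to see in a few lines. The sketch below contrasts the flat bonus as I have described it with one hypothetical smooth alternative. The bell-shaped curve, its centre and its width are my own assumptions for illustration, not anything either model uses, and whether the band's endpoints are inclusive is also an assumption.

```python
import math


def step_bonus(labour_majority_2005):
    """Flat 2-point bonus inside the 6-to-14-point band (endpoints assumed inclusive)."""
    return 2.0 if 6.0 <= labour_majority_2005 <= 14.0 else 0.0


def smooth_bonus(labour_majority_2005, peak=2.0, centre=10.0, width=4.0):
    """Hypothetical bell-shaped alternative: same peak, no cliff edges."""
    z = (labour_majority_2005 - centre) / width
    return peak * math.exp(-0.5 * z * z)


for margin in (5.999, 6.001, 10.0, 14.001):
    print(f"margin {margin:7.3f}: step {step_bonus(margin):.2f}, "
          f"smooth {smooth_bonus(margin):.2f}")
```

The step version jumps by the full 2 points across a 0.002-point change in the 2005 margin, while the smooth version barely moves over the same interval.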

More importantly, why is this adjustment applied only in the case of Labour-Conservative marginal seats, but not in marginal seats involving the Liberal Democrats? Note that our respective projections of the number of Conservative seats are almost exactly the same; we have Conservatives at 299; they have them at 289. The difference is that we have the Liberal Democrats winning a few more seats from Labour. It seems plausible that, had Ford done LibDems the favor of applying the same arbitrary adjustment for them that he did for the Conservatives, his results would be essentially identical to ours — in which case there would be no need for me to be spending my Saturday morning writing about modelling methodology.

Nate also makes a variety of adjustments of this kind, but his changes are not as well grounded in empirical evidence from the polling data. Firstly, the transition matrix he applies to vote shares is based upon a weak evidence base – while pollsters provide details of respondents’ recalled 2005 vote, the transition matrices calculated from this are subject to bias due to respondents’ tendency to misremember their votes – in particular remembering voting for the winning party when they did not. This phenomenon is well established, and British pollsters attempt to correct for it in their weighting. However, any model which uses transitions in vote from polling data is likely to overestimate the extent of switching from the current governing party to opposition parties, because many people who say they voted for the governing party last time did not actually vote for them. We suspect this may contribute to Nate’s high estimate of change from Labour to the opposition parties.

We need not be as naive as Ford implies about the way that the vote might shift from one party to the next. The fundamental characteristic of this election is that Labour will get a lot fewer votes than last time around, and Liberal Democrats will get a lot more. (The Conservatives will probably also do somewhat better, but the difference is slight.) There are only three ways that this might occur:

1. If a large number of voters shift from Labour to LibDems.
2. If a number of voters shift from Labour to LibDems, some other voters shift from Conservatives to LibDems, and a third group shifts from Labour to Conservatives (to counterbalance the votes that Conservatives lost to LibDems).
3. If no votes actually change hands, but the shifts in the parties’ relative standing are determined entirely by voters entering or exiting the electorate.

Let’s see how each of these assumptions would play out in my model.

First, we’ll test a version of Theory #1 by assuming that 25 percent of Labour’s voters shift to Liberal Democrats, and there are no other changes of any kind in the electorate. This would produce a popular vote result of Conservatives 33.2, LibDems 31.7, Labour 27.1, excluding Northern Ireland. Under this assumption (I’ll turn off our regional and incumbent adjustments so that they do not cause any distraction), our model would have Conservatives with 276 seats, Labour with 211, and LibDems with 130.

Next, let’s see how our model would handle Theory #2. In this case, we’ll assume that Liberal Democrats pick up 13 percent of the vote from Labour, as well as 13 percent of the Conservatives’ vote. Conservatives’ losses to LibDems are counterbalanced by a gain of 12 percent of Labour’s vote. This leaves the overall percentage of the vote for each party exactly the same as under Theory #1: Conservatives 33.2, LibDems 31.7, Labour 27.1.

Under these assumptions, which are closest to those employed in my model and closest to what the polling evidence actually shows, LibDems do a bit better, getting up to 138 seats rather than 130, and Labour does a bit worse. But the differences are rather minor.

Finally, let’s check Theory #3. Once again, we’ll require the overall performance to be the same: Conservatives at 33.2 percent, LibDems at 31.7 percent, Labour at 27.1 percent. But we’ll achieve this entirely by adding voters to LibDems from the nonvoter pool, and by moving votes from Labour into the nonvoter pool. Under this scenario, the number of LibDem seats (138) is the same as under Theory #2, and the Conservative and Labour numbers are little changed from the other two cases.

The point of this exercise is that our model is quite robust as to the question of exactly whose votes go to whom. It can make some difference at the margins, but it only amounts to 5 or 10 seats, provided that the nationwide vote share is held constant. The “transition matrix” that determines who gets whose votes is not applied in a vacuum but instead is significantly constrained by the fact that the transitions must balance out in such a way as to closely replicate the national polling averages.
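
The national arithmetic behind Theories #1 and #2 can be checked with a short sketch. The 2005 baseline shares below are back-solved from the scenario totals quoted above, so treat them as assumptions rather than official figures. The helper name is hypothetical, and this covers only the national vote shares, not the seat projection. Theory #3 is left out because it changes the size of the electorate rather than moving votes between parties.

```python
# 2005 Great Britain vote shares, back-solved from the scenario totals
# quoted above; treat them as assumptions rather than official figures.
BASE = {"Con": 33.2, "Lab": 36.1, "LD": 22.7}


def apply_transitions(base, flows):
    """Move a fraction of each source party's base vote to a destination.

    flows maps (source, destination) -> fraction of the source's base vote.
    """
    result = dict(base)
    for (src, dst), fraction in flows.items():
        moved = base[src] * fraction
        result[src] -= moved
        result[dst] += moved
    return result


# Theory #1: a quarter of Labour's vote goes straight to the LibDems.
theory_1 = apply_transitions(BASE, {("Lab", "LD"): 0.25})

# Theory #2: three-way churn that nets out to the same national totals.
theory_2 = apply_transitions(BASE, {
    ("Lab", "LD"): 0.13,
    ("Con", "LD"): 0.13,
    ("Lab", "Con"): 0.12,
})

for name, shares in (("Theory 1", theory_1), ("Theory 2", theory_2)):
    print(name, {party: round(share, 1) for party, share in shares.items()})
```

Both sets of flows land within rounding of the same national result, which is why the choice between them can only matter through where the votes move, not how many.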

Secondly, the changes Nate makes for regional differentials in swing are based on polling data that is two years old and was collected in a very different political environment to the current one – the Conservatives were a long way ahead in the polls while the Lib Dems were far below their current tally. We considered incorporating regional swings based on this data, but rejected the change due to the age of the data. We incorporate changes for Scotland as we have a good evidence base from Scotland specific polling, which is regularly updated.

This is a bizarre statement: the data I use on regional voting patterns (data, ironically, taken from the PoliticsHome website) is not two years old, but rather only a week old, and entirely postdates the LibDem surge that began after the first Prime Ministerial debate.

I agree that it makes little difference whether a regional adjustment is applied other than to account for Scotland; indeed, if we kept the regional polling for Scotland but zeroed out all the rest, it would make almost no difference in the seat estimates our model produces.

We do not attempt to model “tactical voting”, or the effects of incumbent retirements because we simply do not have good quality, recent data on the pattern or level of such effects. Our own regression analysis of incumbent effects did not reveal robust effects of incumbent retirements in recent elections, so we are rather surprised to learn that Nate has uncovered some. Modelling effects such as these, where the statistical evidence is weak requires making strong assumptions. We prefer not to make such assumptions, sticking only to effects where the evidence base is very strong.

The data I evaluated on the incumbency effect was from the 2005 election, as per Pippa Norris’s dataset. In this case, there are clear and statistically significant effects resulting from incumbent retirements. For example, if one performs a simple linear regression on Labour’s vote share in districts where they were the incumbent party in 2005, where the independent variables are the standings of the three parties in 2001 and whether the incumbent retired or not, the retirement effect is significant at the 95 percent confidence level. The same is true for Conservative retirements. The effect for Liberal Democrats is not statistically significant at the 95 percent threshold, but it is at the 90 percent threshold, and its magnitude is larger.
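
For readers who want to see the shape of that regression, here is a rough sketch on synthetic data. It is not the analysis itself, which used the Norris dataset: every constituency below is randomly generated, the 2-point retirement penalty is an assumed value planted in the data, and the small `ols` helper is written out by hand only to keep the example free of dependencies. The 1.96 cutoff is the usual large-sample approximation for the 95 percent level.

```python
import math
import random


def ols(rows, y):
    """Ordinary least squares via the normal equations (intercept added here).

    Returns (coefficients, standard_errors), intercept first.
    """
    X = [[1.0] + list(r) for r in rows]
    n, k = len(X), len(X[0])
    xtx = [[sum(row[a] * row[b] for row in X) for b in range(k)] for a in range(k)]
    xty = [sum(row[a] * yi for row, yi in zip(X, y)) for a in range(k)]
    # Gauss-Jordan on [X'X | X'y | I] leaves [I | beta | inverse of X'X].
    aug = [xtx[a] + [xty[a]] + [float(a == b) for b in range(k)] for a in range(k)]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(k):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    beta = [aug[a][k] for a in range(k)]
    resid = [yi - sum(b * x for b, x in zip(beta, row)) for row, yi in zip(X, y)]
    sigma2 = sum(e * e for e in resid) / (n - k)
    se = [math.sqrt(sigma2 * aug[a][k + 1 + a]) for a in range(k)]
    return beta, se


# Synthetic constituencies: every number below is invented for illustration.
rng = random.Random(0)
TRUE_RETIREMENT_EFFECT = -2.0  # assumed, in points of vote share
rows, labour_2005 = [], []
for _ in range(300):
    lab, con, ld = rng.uniform(35, 65), rng.uniform(10, 40), rng.uniform(5, 25)
    retired = 1.0 if rng.random() < 0.2 else 0.0
    rows.append([lab, con, ld, retired])
    labour_2005.append(2.0 + 0.9 * lab - 0.05 * con - 0.1 * ld
                       + TRUE_RETIREMENT_EFFECT * retired + rng.gauss(0, 1.5))

# Coefficient order: intercept, Lab 2001, Con 2001, LD 2001, retired dummy.
beta, se = ols(rows, labour_2005)
t_retired = beta[4] / se[4]
print(f"retirement coefficient {beta[4]:.2f}, t-statistic {t_retired:.1f}")
```

The coefficient on the retirement dummy is the quantity of interest; an absolute t-statistic above roughly 1.96 corresponds to significance at the 95 percent level in a sample of this size.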

Perhaps the effect would disappear if I examined elections prior to 2005 (Norris’s dataset only included information on incumbency for the 2005 election), but the incumbent effect, although fairly small in magnitude, was quite unambiguous last time around. And it’s hardly radical to suggest that incumbency is an advantage; its advantages have been observed across a wide range of countries with a wide variety of voting systems from the dawn of political time.

On top of our votes to seats projection, we also make efforts to develop a robust estimate of current public opinion. Nate freely admits that his public opinion figures are “educated guesses based on recent cross-tabular results”. We employ a state space model to estimate current public opinion every few days, while controlling for systematic “house effect” differences between the pollsters and differences in the sample sizes they employ in their polls. The polling data inputted into our model is therefore based on a more systematic aggregation of available public opinion, although to be fair our current estimate of public opinion is quite close to Nate’s.

I have not concentrated on calibrating the national polling averages to adjust for things like house effects and pollster reliability. It is absolutely a valuable exercise to do so; this is, after all, much of the basis for our U.S.-based election forecasts. But in this instance, it is a far less important (and interesting) question than determining whether uniform swing is in fact a valid model for projecting a seat count, which as we’ve demonstrated can easily make a difference of 60 or more seats in one’s projection. (Imagine that we were trying to forecast the electoral vote count in the United States but there were only national polls available and no state-level ones; this is essentially the situation in the United Kingdom. What would be more important: to figure out exactly what Rasmussen’s house effect is? Or to tackle the more fundamental question of how national polls might translate into the electoral college?)

Besides, there’s an easy way out of this: we can steal their method of coming up with a polling average — which looks very sound to me — but then let our model do the work of translating that polling average into a seats projection, where I believe it is stronger. Here is what we get when we do that:

With their polling average but our method for forecasting seats, we come up with 308 seats for Conservatives (rather than their 289), 115 seats for LibDems (rather than their 98), and 195 seats for Labour (rather than their 231).

To sum up, we believe our model has a stronger basis in existing analysis of UK voting patterns, and is based upon techniques that were employed successfully in 2005.

Of course, this election is perhaps the most difficult to predict since polling began in Britain, and it may be that uniform swing fails miserably, and that proportional swing of the form Nate proposes manifests strongly next Thursday. We prefer to navigate these uncharted waters with tried and tested methods as a guide, Nate suggests a radically new environment requires radically new methods. We will all know for sure in a week!

I agree that there is a lot we don’t know, and that either approach could prove to be more accurate. However, it is probably not correct for them to assert that their model is more solidly grounded in the empirical evidence. The paper which they cite as evidence that uniform swing is more accurate in fact says nothing of the sort. Moreover, the primary way in which their model deviates from uniform swing is by means of a completely arbitrary adjustment that assigns extra credit to Conservatives in certain marginal seats (they strangely do not make the same adjustment for Liberal Democrats). Had they a more elegant and empirically grounded way to do this, their argument would be much stronger.

To be clear, accuracy ought to be the paradigm here. We’re not trying to prove or disprove anything to a certain academic degree of certitude; we’re trying to make a forecast. But even as a theoretical matter, I am not sure that uniform swing is a more conservative approach than mine so much as it is merely a more traditional one. It too makes a “strong assumption” of the type they claim to avoid: that the voting patterns in this year’s election will strongly resemble the last one, plus or minus some constant. Arguably the most neutral assumption (although obviously a naive one) would instead be that the seat count ought to mirror the vote count transformed by some function, so that, for instance, if the Liberal Democrats finish with more votes than Labour, they ought to earn more seats. Instead, uniform swing is itself a hypothesis, and an unproven one. From the standpoint of aesthetics, it is a hypothesis with the virtue of simplicity, but the failing of inelegance (for instance, in the way in which it can assign negative votes to parties under many circumstances). As a practical matter, it has performed well in some elections but failed badly in others, especially recently.

In any event, the fact that two reasonable approaches can produce such radically different results is fascinating. This is rare. A lot of the time, you’ll go through a lot of work to build a fancy model, and its results will barely diverge from a simpler one. That is not a concern here! Even in its simplest form, my model produces radically different results from a basic uniform swing calculation.

What people ought to do when this sort of situation presents itself is to look under their hood: open up their spreadsheets, re-examine their assumptions, and test their models for robustness. After initially releasing our model six days ago, we did exactly that, making several refinements to our methodology and finding that they did not contradict our initial assertion that this election implies substantially more downside for Labour, and substantially more upside for Conservatives and Liberal Democrats, than simpler forecasting techniques would suggest. Ford’s critique, on the other hand, does little to advance the science. Our model may be wrong, but I’d rather fail by being too ambitious than too stubborn.

FiveThirtyEight
