How Our UK Forecast Model Works

After last night’s debate, it’s going to be a couple of days before we have a solid handle on how voting intentions in the United Kingdom might have changed, if at all. So, let’s talk a bit more about Version 2.0 of the UK forecast model that we first unveiled on Sunday.

Our model is considerably more aggressive than almost any other in forecasting Conservative and Liberal Democrat pickups from Labour. However, the science of UK electoral forecasting is not terribly advanced. The standard method, called uniform swing (the idea that the vote shifts by the same margin everywhere in the country), has failed badly in past elections like 1997, in which there was a dramatic shift in voting intentions from one party to another, and may face additional strains in a three-way race like this one. That’s not to say that we’ve necessarily gotten everything right. Perhaps we’ll even be quite wrong. But there also ought to be no particular benefit of the doubt given to uniform swing, whose only apparent virtue is its simplicity. Indeed, if you look at where people are putting their money, the betting markets are a lot closer to our model than to uniform swing in predicting a shift of seats away from Labour. In fact, the markets are even more aggressive than our model about anticipating a shift from Labour to Conservatives, although we see somewhat better things in store for Liberal Democrats.
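For readers who have not seen uniform swing written down, here is a minimal sketch of it in Python. The function name, the seat figures and the current national shares are all hypothetical and chosen only for illustration; clamping a negative share to zero is also my assumption, not part of any published method.

```python
def uniform_swing(seat_2005, national_2005, national_now):
    """Project one seat by adding each party's national change in points.

    All arguments are {party: vote share in percentage points}.
    """
    projected = {}
    for party, share in seat_2005.items():
        swing = national_now[party] - national_2005[party]  # same everywhere
        # Assumption: a share cannot go below zero, so clamp it.
        projected[party] = max(0.0, share + swing)
    return projected


# Hypothetical national shares (points); not actual polling figures.
NATIONAL_2005 = {"Lab": 36.0, "Con": 33.0, "LD": 23.0}
NATIONAL_NOW = {"Lab": 24.0, "Con": 35.0, "LD": 31.0}

safe_seat = uniform_swing({"Lab": 50.0, "Con": 30.0, "LD": 20.0},
                          NATIONAL_2005, NATIONAL_NOW)
weak_seat = uniform_swing({"Lab": 5.0, "Con": 55.0, "LD": 40.0},
                          NATIONAL_2005, NATIONAL_NOW)
```

Note what happens in the second seat: Labour cannot lose 12 points where it only had 5 to begin with, which is one of the places where the method strains.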

The first step in our model is liable to be the most controversial. We construct a matrix of vote transitions from each party to every other — for instance, we might assume that 13 percent of Labour’s vote from 2005 goes to Liberal Democrats this time around, and 9 percent goes to Conservatives:

Whilst the values in this matrix are constructed by hand, they do have some empirical support as many pollsters publish cross-tabular results indicating how voting intentions have changed from 2005 to today. Particularly with respect to the three most important flows of votes — from Labour to LibDems, from Labour to Conservatives, and from Conservatives to LibDems — they tend to provide fairly consistent answers. We impose the additional constraint that the votes ought to “add up” correctly — that is, the votes ought to roughly match current national polling averages after they are taken apart and put back together again.
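To make the idea concrete, here is a rough sketch of how a transition matrix can be applied and then checked against national polls. Only the 13 percent and 9 percent Labour flows come from the example above; every other number, the party labels and the function names are hypothetical placeholders rather than the values our model uses.

```python
# Shares of the whole electorate in 2005, nonvoters included (hypothetical).
ELECTORATE_2005 = {"Lab": 0.22, "Con": 0.20, "LD": 0.14,
                   "Nat": 0.02, "Oth": 0.03, "NonVoter": 0.39}

# TRANSITION[a][b] = fraction of a's 2005 supporters who choose b now.
# Each row must sum to 1. Only Lab->LD and Lab->Con echo the text.
TRANSITION = {
    "Lab": {"Lab": 0.66, "Con": 0.09, "LD": 0.13, "Oth": 0.02, "NonVoter": 0.10},
    "Con": {"Con": 0.85, "LD": 0.07, "Oth": 0.03, "NonVoter": 0.05},
    "LD": {"LD": 0.82, "Con": 0.06, "Lab": 0.05, "NonVoter": 0.07},
    "Nat": {"Nat": 0.90, "NonVoter": 0.10},
    "Oth": {"Oth": 0.80, "Con": 0.10, "NonVoter": 0.10},
    "NonVoter": {"NonVoter": 0.80, "LD": 0.09, "Con": 0.06, "Lab": 0.04, "Oth": 0.01},
}


def apply_transitions(old, transition):
    """Move each group's supporters to their new destinations."""
    new = {party: 0.0 for party in old}
    for source, amount in old.items():
        row = transition[source]
        if abs(sum(row.values()) - 1.0) > 1e-9:
            raise ValueError(f"row for {source} does not sum to 1")
        for dest, fraction in row.items():
            new[dest] += amount * fraction
    return new


def shares_among_voters(electorate, nonvoter="NonVoter"):
    """Drop nonvoters and renormalise so the parties sum to 1."""
    voting = {p: v for p, v in electorate.items() if p != nonvoter}
    total = sum(voting.values())
    return {p: v / total for p, v in voting.items()}


def largest_gap(implied, polls):
    """Biggest absolute miss versus a polling average, for the add-up check."""
    return max(abs(implied[p] - polls[p]) for p in polls)


electorate_now = apply_transitions(ELECTORATE_2005, TRANSITION)
implied_shares = shares_among_voters(electorate_now)
```

In practice one would adjust the hand-built rows until `largest_gap(implied_shares, polls)` is acceptably small for whatever polling average is being targeted.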

Note that, in addition to Labour, Conservatives, and LibDems, we have three other “parties”. One represents regional and nationalist parties in Scotland, Wales and Northern Ireland. Another represents all other minor parties, like the UK Independence Party. And the third — new to Version 2.0 of our model — consists of a “party” of nonvoters. That is, voters are permitted to enter and exit the electorate in addition to transitioning amongst the parties. Polls generally show that Liberal Democrats will get the most new voters, while Labour is probably most prone to have problems with turning out their base. In practice, however — while we had initially surmised that our more optimistic results for LibDems were a result of having failed to account for voters entering and exiting the electorate — this does not appear to make much difference as far as the relative standing of the parties.

The second step is to customize these vote shifts to the UK’s 650 individual constituencies. Originally, this step had been simple: we assumed that the proportion of voters changing hands from one party to the next was the same in each constituency. For instance, if Labour loses 13 percent of its voters to LibDems nationwide, we had assumed that they also lost 13 percent of their vote to LibDems in each individual constituency, based on the notional 2005 results.

But this was illogical, at least at the margins; technically, for instance, it had the Scottish Nationalists picking up some of Labour’s lost votes in Central London. We now have devised a more complicated method wherein, although the proportion of the total vote lost is the same in every constituency, which party those votes are assigned to can vary. In particular, they vary based upon the relative proportion of the vote realized by the parties in 2005. So, for instance, in a constituency in which LibDems had finished in second place to Labour in 2005, they will tend to get more of Labour’s lost votes, whereas if Conservatives had finished second instead, a comparatively higher proportion of the ex-Labour voters will be assigned to them.

The math on this gets modestly complicated, since the vote transition percentages still need to add up correctly at the national level. That is, on average Labour still needs to lose (for instance) 13 percent of their voters to the Liberal Democrats. But in places where the LibDems’ share of the non-Labour vote had been higher, this percentage will be proportionately higher (perhaps it could be 20 percent). Likewise, it will be lower in places where the LibDems’ standing had been relatively inferior (perhaps it would just be 7 percent). No party will pick up votes in a constituency in which they had no presence in 2005.
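One simple way to sketch this kind of allocation is shown below. It is an illustration under stated assumptions, not our actual formula: each destination party is weighted by its national transition rate times its local share of the non-Labour vote relative to its national share of the non-Labour vote, and the weights are then rescaled so the total lost is the same in every seat. This sketch does not perform the final calibration that makes the flows add up exactly at the national level, and the 2005 shares are round hypothetical numbers.

```python
def local_loss_split(national_rates, local_2005, national_2005, source):
    """Split the source party's lost vote among destinations in one seat.

    national_rates: {dest: fraction of source's 2005 vote going to dest}
    local_2005, national_2005: {party: 2005 vote share}
    Returns {dest: fraction of source's local 2005 vote going to dest}.
    """
    total_lost = sum(national_rates.values())  # identical in every seat
    local_rest = sum(v for p, v in local_2005.items() if p != source)
    national_rest = sum(v for p, v in national_2005.items() if p != source)

    weights = {}
    for dest, rate in national_rates.items():
        local_rel = local_2005.get(dest, 0.0) / local_rest if local_rest > 0 else 0.0
        national_rel = national_2005[dest] / national_rest
        weights[dest] = rate * local_rel / national_rel

    weight_sum = sum(weights.values())
    if weight_sum == 0:
        # No other party had any presence: the lost votes leave the electorate.
        return {"NonVoter": total_lost}
    return {d: total_lost * w / weight_sum for d, w in weights.items()}


NATIONAL_2005 = {"Lab": 0.36, "Con": 0.33, "LD": 0.23, "Oth": 0.08}  # hypothetical
LABOUR_LOSSES = {"LD": 0.13, "Con": 0.09}  # the example rates from the text

# Seat A: LibDems were second in 2005. Seat B: Conservatives were second.
seat_a = local_loss_split(LABOUR_LOSSES,
                          {"Lab": 0.50, "LD": 0.35, "Con": 0.10, "Oth": 0.05},
                          NATIONAL_2005, "Lab")
seat_b = local_loss_split(LABOUR_LOSSES,
                          {"Lab": 0.50, "Con": 0.40, "LD": 0.08, "Oth": 0.02},
                          NATIONAL_2005, "Lab")
```

With these inputs Labour loses 22 percent of its 2005 vote in both seats, but the LibDems take more than their national 13 percent in seat A and less in seat B.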

The net effect of this is something resembling what is normally called tactical voting: that is, that the more competitive parties (like the party which finished in second place in the previous election) will tend to benefit more than the less competitive ones. Although we had some debates about how, if at all, to implement an adjustment for tactical voting, ultimately I (Nate) decided that this more holistic and mathematically elegant approach was preferable to an ad-hoc adjustment, i.e., designating certain marginal constituencies as being subject to a different set of rules. My preference on this was determined partly by the fact that voters rarely claim to be voting tactically in surveys, and may also lack awareness about the tactical implications of their vote. While undoubtedly some voters do vote tactically, my impression is also that the phrase “tactical voting” has somewhat too much currency and is often used as a band-aid to cover up flaws with the uniform swing model. It’s easier to say “see, the voters are smart, they’re voting tactically!” than to acknowledge that the uniform swing is dumb, which is arguably more the reality.

Further note that there are some interesting implications in our method owing to the fact that we treat nonvoters as a separate “party”. In a hypothetical district where Labour had 100 percent of the vote in 2005, for instance, they will still have 100 percent of the vote in 2010. In a literal sense, the model assumes that Labour still do lose votes, but these voters simply drop out of the voting pool rather than transitioning to one of the other parties, producing lower turnout but Labour still with 100 percent of the voters who do make it to the polls.

Most of the heavy lifting is done at this stage, but we have a couple of additional wrinkles to give the model additional accuracy.

The third step is an adjustment for retiring incumbents, of which there are a lot this year: the Labour party in particular has had around 100 MPs volunteer to give up their jobs. The incumbency advantage is not terribly large in the UK, but we found based on regression analysis that Conservatives and Labour underperformed by about 1.5 points in constituencies where they had a retiring incumbent, and Liberal Democrats and other parties by 3.0 points. Thus, Labour and Conservatives received a penalty of 1.5 points, and the other parties 3 points, in a constituency where they indeed had a retirement, with their lost votes transitioned proportionately to the other parties. On the other hand, because the cost of those retirements has presumably been “priced into” the national polling averages that we’re attempting to calibrate to, the scales are counterbalanced by giving a slight bonus to nonretiring incumbents. On a net basis, this adjustment winds up hurting Labour by only a couple of seats, in spite of their very large number of retirements.
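The penalty side of this adjustment can be sketched as follows. The 1.5 and 3.0 point figures come from the regression described above; the function name, the party labels and the reading of “proportionately” as “in proportion to each other party’s local share” are my assumptions for illustration. The offsetting bonus for nonretiring incumbents is not modelled here, since its size is not stated.

```python
# Penalty in percentage points when a party's incumbent retires.
PENALTY_POINTS = {"Lab": 1.5, "Con": 1.5}
DEFAULT_PENALTY = 3.0  # Liberal Democrats and other parties


def apply_retirement_penalty(shares, retiring_party):
    """Dock the retiring party and hand the points to the other parties.

    shares: {party: projected vote share in percentage points}
    """
    others_total = sum(v for p, v in shares.items() if p != retiring_party)
    if others_total == 0:
        return dict(shares)  # nobody to receive the votes; leave unchanged

    penalty = PENALTY_POINTS.get(retiring_party, DEFAULT_PENALTY)
    penalty = min(penalty, shares[retiring_party])  # cannot lose more than held

    adjusted = {}
    for party, share in shares.items():
        if party == retiring_party:
            adjusted[party] = share - penalty
        else:
            # Assumption: split in proportion to each party's local share.
            adjusted[party] = share + penalty * share / others_total
    return adjusted


seat = {"Lab": 45.0, "Con": 30.0, "LD": 20.0, "Oth": 5.0}  # hypothetical seat
after_lab_retirement = apply_retirement_penalty(seat, "Lab")
after_ld_retirement = apply_retirement_penalty(seat, "LD")
```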

The fourth step is a regional adjustment. We use polling results from the 11 regions as defined by Populous Home/YouGov, and make an additive or subtractive adjustment where our method appears to be overestimating or underestimating the vote in a particular region. The adjustment is carefully calibrated so that no party loses or gains votes nationwide as the result of the regional shift: any vote they lose in one region must be won back in some other region.
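One way to get a regional shift that nets out to zero nationally is to subtract the vote-weighted average gap from each region’s gap. The sketch below does this for a single party; the region names, figures and weights are hypothetical, and it does not attempt to keep the parties within a region summing to 100, which a full implementation would also need to handle.

```python
def zero_sum_regional_shifts(model, polls, weights):
    """Additive shift per region for one party, netting to zero nationally.

    model, polls: {region: the party's vote share in points}
    weights: {region: the region's share of the national vote}
    """
    gaps = {region: polls[region] - model[region] for region in model}
    total_weight = sum(weights[region] for region in gaps)
    mean_gap = sum(gaps[r] * weights[r] for r in gaps) / total_weight
    # Removing the weighted mean leaves the national total untouched.
    return {region: gap - mean_gap for region, gap in gaps.items()}


# Hypothetical Labour figures for three made-up regions.
MODEL = {"Scotland": 30.0, "London": 28.0, "North": 34.0}
POLLS = {"Scotland": 38.0, "London": 27.0, "North": 33.0}
WEIGHTS = {"Scotland": 1.0, "London": 2.0, "North": 3.0}

shifts = zero_sum_regional_shifts(MODEL, POLLS, WEIGHTS)
net_national_change = sum(shifts[r] * WEIGHTS[r] for r in shifts)
```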

The most important implication of the regional adjustment is in Scotland, which often disobeys national trends and appears poised to do so again this year. In fact, polls show that there has been very little change in voter sentiment in Scotland since 2005, in spite of tectonic shifts elsewhere — Labour’s vote has dropped hardly at all, for instance. This is actually bad news for Labour, because Scotland is a place where they have a lot of very safe seats (although LibDems or nationalist parties are competitive in some constituencies). And if Labour aren’t losing votes in Scotland — a place where they could afford to lose them — that means they are losing votes in other places where the results are more marginal. However, the net effects are rather minor. If we turn the regional adjustment off, we get a projection of Conservatives 304, Labour 201, LibDems 113, as opposed to our current figure of Conservatives 299, Labour 199, LibDems 120. Although other models posit larger impacts from regional voting patterns, this may reflect the inadequacies of the assumption of a uniform national swing rather than anything in particular having to do with the regions themselves (Scotland excepted). That is, most of what the regional adjustment is trying to accomplish, our model has already “solved” in step two where we depart from uniform swing; this additional step serves mostly for redundancy and to adjust for any impacts from Scottish voting patterns.

The fifth and last step is exceedingly minor: we disaggregate the results from the regional and minor parties, based on the proportion of the vote they received in 2005, so as to ensure that we don’t assign a seat to a minor party when the total vote share for the minor parties exceeds that for any of the major parties, but no individual minor party does.

That’s it; from there, we simply see which party is projected to receive the most votes in each constituency and tally up the results.
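The disaggregation and the final tally can be sketched in a few lines. The party names “MinorA” and “MinorB”, the seat figures and the function names are hypothetical; the point is only that a grouped bloc is split by its members’ 2005 shares before the winner is chosen.

```python
from collections import Counter


def disaggregate(bloc_share, bloc_2005):
    """Split a grouped bloc's projected share by its members' 2005 shares."""
    total = sum(bloc_2005.values())
    if total == 0:
        return {}
    return {party: bloc_share * v / total for party, v in bloc_2005.items()}


def winner(major, bloc_share, bloc_2005):
    """Party with the most projected votes once the bloc is broken apart."""
    candidates = dict(major)
    candidates.update(disaggregate(bloc_share, bloc_2005))
    return max(candidates, key=candidates.get)


def tally(constituencies):
    """Count projected seats across (major, bloc_share, bloc_2005) tuples."""
    return Counter(winner(*seat) for seat in constituencies)


# Seat 1: the bloc totals 32, more than Labour's 30, but no single member wins.
seat_1 = ({"Lab": 30.0, "Con": 28.0, "LD": 10.0}, 32.0,
          {"MinorA": 18.0, "MinorB": 6.0})
# Seat 2: one member of the bloc is strong enough on its own.
seat_2 = ({"Lab": 30.0, "Con": 20.0, "LD": 10.0}, 40.0,
          {"MinorA": 30.0, "MinorB": 2.0})

seats = tally([seat_1, seat_2])
```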

So, will it work? It’s a relatively elegant model, which there’s something to be said for, and it embodies what I believe to be sound logic about voter behavior. Its results square better with my intuition — it’s hard to imagine that Labour could lose one-third of its votes (from 36 percent to say 24) and not be severely injured — as well as with the intuition of the betting markets. And we have a relatively low bar to clear because there’s no evidence that uniform swing is much good at all. Perhaps an advanced version of uniform swing, like the PoliticsHome version that accounts for Scottish voting, makes an ad-hoc adjustment for tactical voting, and treats its estimates probabilistically (something our model does not do) would be a good hedge.

However, our method is not entirely experimental; there is some evidence that it better resembles historical voting patterns. Here, for instance, is a comparison of the Labour vote share in 1992 to that in 1997, the last time that there was a major shift in voter preferences (Labour jumped from 34 percent of the vote to 43).

Note that the vote shift is not quite linear but forms something of a convex curve; Labour gained relatively less in seats where they were already dominant, and in seats where they had little presence, but relatively more in those where they were somewhere in the middle. Those in-between seats are the ones most likely to change hands, which is why Labour significantly outperformed their uniform swing projection that year. We show an almost-identical pattern in our forecast for the Liberal Democrats:

Finally, bear in mind that our model is very sensitive to small changes in voter preferences. It could wind up looking good for the wrong reasons if its method for allocating votes to seats is flawed but this is counterbalanced by inaccurate polling which pushes in the other direction; the opposite of course is also true. We’ll continue to publish multiple scenarios from the model as a robustness check.

–Nate Silver

The FiveThirtyEight UK forecasting model was developed by Nate Silver in conjunction with Renard Sexton and Daniel Berman.
