Nerdfight: Episode IV — Return of the Tories

We’re getting into a pretty darn interesting back-and-forth with Robert Ford of the University of Manchester and his team of three colleagues who have developed a UK elections forecasting model for PoliticsHome; our own model is described here. I recognize that these discussions are quite technical, but this is the sort of thing that we used to do a little bit more of at 538 and which is always going to be at the core of our brand.

To bring you up to speed, Episode I (Robert’s critique of our model) is here, Episode II (our defense and counter-critique) is here, and Episode III (the counter-counter-critique) is here. If you’re going to read this post, I’d strongly encourage you to read the other three first (think of this as an episode of Lost with no helpful flashback/medley sequences to catch you up).

There’s plenty that we agree on here.

First, I think we’re all agreed that this has been a really valuable discussion to have. Frankly, when we posted the beta version of our model last Sunday and it showed radically different results from uniform swing, I was disappointed that it hadn’t inspired a bit more debate. If things have gotten a little bit heated at times … well, It’s The Internet. That Happens. We’re on good terms with Rob and his team and hope to catch up for a pint sometime when our paths cross in the future.

Second, I think that they’re right that our incumbency adjustment is dodgy. I included it simply because when we tested it on the 2005 election — the only data we had handy — it came up as being quite robust, and it seemed perfectly natural to me coming from an American context where the incumbency effect tends to be quite large. But if it hasn’t shown up in prior recent elections, or if incumbency had in fact been a disadvantage in some of them, that’s enough for us to remove it from our model, as we will do going forward. I also tend to agree with Rob’s point that in this particular election, with various tangible and intangible signs of anti-incumbent sentiment, it’s especially dubious to assume that incumbency will be advantageous.

Thirdly, we both recognize that we’re operating in something of an information vacuum here. Leadership elections are always hard to study in that they only occur every so often (exactly once every four years in the United States and about once every four years in the United Kingdom). Moreover, each election may be significantly different from the ones that have preceded it. In this case, an election which looks as though it will feature both a major shift in votes away from one of the two main parties (as in 1997 or 1979) and a major surge by a third party, the effective sample size from past UK elections is basically zero. We’re literally in uncharted territory, and so claims that such-and-such model would have performed better on such-and-such occasions should be viewed skeptically. Moreover, while the differences between uniform swing and proportional swing models have generally been fairly small, they are quite dramatic here, which alone attests to the uniqueness of the election.

Fourthly, I’m agreed that it’s an annoying facet of our model that we have to fill in the ‘transition matrix’ (whose votes go to whom) by hand. I don’t think this is necessarily a disadvantage insofar as accuracy goes: there are times when a more robust, but more subjective method (especially one which is mindful of the objective evidence) will beat a method that is more objective but less nimble. This may be one of those times or it may not. Nevertheless, it’s a disadvantage that we can’t just plug-and-play with our model: we have a lot of decisions to make each time that we run an update.

With all that said, I think there are a couple of points where we have some authentic disagreement. One concerns the intellectual foundations of the different models. To be very blunt, while I appreciate having been pointed to the academic literature on this, I find some of the discussions to be abstract to the point of being confusing. One thing that I think should be mentioned is that, in the literature, the discussions aren’t usually framed in the context of why elections should follow uniform swing, but rather, why they could follow uniform swing, particularly given that uniform swing is arguably somewhat counter-intuitive. They are more plausible defenses of uniform swing than critiques of other models.

Another, more particular problem is that evidence rooted in discussions about the number of ‘swing’ or ‘floating’ voters is actually rather ambiguous insofar as its implications for model selection. Let’s take a greatly simplified case in which there are just two parties — Red and Blue — and three seats in a particular country: a safe Red seat, a safe Blue seat, and a marginal seat. Suppose that Red has grown unpopular, and we anticipate under a proportional swing method that they will lose 20 percent of their voters from the previous election to Blue (nobody switches the other way from Blue to Red).

Would we expect to see more people having switched their vote in the marginal seat? Not really. Under a proportional swing approach, the seat with the greatest number of switchers will actually be the Safe Red seat. The Safe Blue seat will have the fewest switchers. And the marginal seat will be somewhere in between, about at the average of the other two.
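The arithmetic can be sketched in a few lines of Python. The seat shares below (70/30, 50/50 and 30/70) are illustrative assumptions rather than figures from any real constituency, and the three seats are assumed to be the same size; only the 20 percent loss rate comes from the example above. The uniform swing figure is included purely as a contrast.

```python
# Hypothetical Red/Blue shares (percent of votes) at the previous election.
# These numbers are assumptions chosen only to illustrate the argument.
SEATS = {
    "Safe Red": {"Red": 70, "Blue": 30},
    "Marginal": {"Red": 50, "Blue": 50},
    "Safe Blue": {"Red": 30, "Blue": 70},
}
RED_LOSS_PCT = 20  # percent of previous Red voters who switch to Blue


def proportional_switchers(seat):
    """Switchers per 100 voters when Red loses a fixed fraction of its own vote."""
    return seat["Red"] * RED_LOSS_PCT / 100


def uniform_switchers(seats):
    """Switchers per 100 voters when every seat gets the same national point change."""
    national_red = sum(s["Red"] for s in seats.values()) / len(seats)
    return national_red * RED_LOSS_PCT / 100


if __name__ == "__main__":
    for name, seat in SEATS.items():
        print(name, proportional_switchers(seat), uniform_switchers(SEATS))
```

Under these assumptions the proportional method moves 14, 10 and 6 voters per hundred in the Safe Red, marginal and Safe Blue seats respectively, while uniform swing moves 10 per hundred in every seat.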

Of course, when you have voters switching between multiple parties, some against-the-current switching (e.g. some voters going from Blue to Red in contradiction of the general trend), and voters shifting into or out of the electorate (all of which happens in real life and all of which is assumed in our model), things get a lot more complicated. But evidence, for instance, showing that there were about the same number of vote-switchers in marginal seats as in safe ones does not particularly cut against proportional swing; it may in fact be quite consistent with proportional swing, depending on what assumptions one makes.

In general, one thing I like about our approach is that it allows us to deal a bit less abstractly with these questions. We identify some particular fraction of voters that we expect to switch from Red to Blue, or Blue to Red, or Red to Yellow, or Yellow to nobody, and the projection reflects the summation of these decisions. Under uniform swing, we only see the result of the decisions, but we do not know how we get there. For instance, if Yellow winds up with more votes than in the previous election and Red fewer, it could simply mean that a lot of voters have switched from Red to Yellow. But there could also be (and there will be in the real world) more complicated traffic patterns: some voters switching from Red to Yellow, but also some switching from Red to Blue, and some switching from Blue to Yellow, and so forth, in order to produce that result.
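To make that concrete, here is a minimal sketch of applying a transition matrix. It is not our production code: the function name `apply_transitions`, the starting shares and every switching rate are assumptions invented for the illustration. It shows two different sets of flows that leave the parties with identical totals, which is exactly the information that a net swing figure throws away.

```python
def apply_transitions(shares, transitions):
    """Move a fraction of each source party's previous vote to a destination.

    shares:      party -> previous vote share (points)
    transitions: (source, destination) -> fraction of the source's voters who move
    """
    result = dict(shares)
    for (src, dst), fraction in transitions.items():
        moved = shares[src] * fraction  # always a fraction of the *previous* vote
        result[src] -= moved
        result[dst] = result.get(dst, 0.0) + moved
    return result


# Hypothetical previous-election shares.
PREVIOUS = {"Red": 40.0, "Blue": 40.0, "Yellow": 20.0}

# Story 1: one direct flow from Red to Yellow.
DIRECT = {("Red", "Yellow"): 0.10}

# Story 2: more complicated traffic with the same net effect.
TANGLED = {
    ("Red", "Yellow"): 0.05,
    ("Red", "Blue"): 0.05,
    ("Blue", "Yellow"): 0.05,
}

if __name__ == "__main__":
    print(apply_transitions(PREVIOUS, DIRECT))
    print(apply_transitions(PREVIOUS, TANGLED))
```

Both stories end at roughly Red 36, Blue 40, Yellow 24 nationally, but they would play out differently in seats where the starting mix of Red and Blue voters differs.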

***

The other dispute concerns the approach that Robert’s team uses in marginal constituencies, in which they give 2 extra points of ‘swing’ to Conservatives in certain Labour-Conservative marginals. I have argued that this is arbitrary. Robert has said that it has some basis both in theory and in fact, such as in polling conducted of groups of marginal districts.

Mind you, I don’t disagree that Conservatives do potentially stand to overperform in these marginal seats. I just prefer a more organic approach to achieve that result. Even if we knew, for instance (such as from polling), that Conservatives would perform 2 points better on average in certain marginal seats, this would probably not look like a step function (as their model implicitly assumes) but instead like some kind of curve, where their bonus was somewhat larger than 2 points in the most marginal seats but less than 2 points in the marginally marginal ones. Because the models, by definition, are extremely sensitive to exactly what happens in these marginal constituencies, this could have some effect on the forecast.
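The difference between the two shapes can be sketched as follows. Everything here is hypothetical: the 10-point cutoff for what counts as a marginal, the bell-shaped curve, and its peak and width are assumptions picked for illustration, not parameters from Robert's model or from ours.

```python
import math


def step_bonus(margin, cutoff=10.0, bonus=2.0):
    """Flat bonus (points) inside an assumed marginality cutoff, nothing outside."""
    return bonus if abs(margin) <= cutoff else 0.0


def smooth_bonus(margin, peak=3.0, width=8.0):
    """Bonus (points) that is largest in the tightest seats and fades gradually.

    The bell-shaped form is only one of many plausible curves.
    """
    return peak * math.exp(-((margin / width) ** 2))


if __name__ == "__main__":
    for margin in (0, 5, 10, 15):
        print(margin, step_bonus(margin), round(smooth_bonus(margin), 2))
```

With these made-up settings the curve gives more than 2 points in a dead-heat seat and well under 2 points at a 10-point margin, whereas the step function treats both seats identically and then drops abruptly to zero.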

Another way to put this is that I prefer to look at evidence such as in polling of marginal districts a bit less literally and more in terms of what hints it might give us about voter behavior in general. For a variety of reasons, in fact, I don’t think that much stock should be placed in the marginals polls themselves:

* The marginals polls are relatively few in number.
* The marginals polls do not always provide an apples-to-apples baseline, i.e., what happens in non-marginal districts.
* Even where polls imply overperformance in marginal districts, the magnitude of the overperformance (although nevertheless important from a forecasting standpoint) is generally smaller than the poll’s reported margin of error, and almost always smaller than the poll’s de facto margin of error when things like house effects are accounted for.
* The marginals polls often combine different types of marginal seats into the same poll, e.g. Labour-Conservative marginals where the LibDems are also a factor and those where they aren’t, each of which may behave somewhat differently.
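As a rough check on the margin-of-error point, the sketch below uses the textbook formula for a simple random sample at 95 percent confidence. The sample size of 1,000 and the 40 percent share are hypothetical, and real polls are weighted, so their effective margins are typically somewhat wider than this.

```python
import math


def margin_of_error(share, sample_size, z=1.96):
    """Approximate margin of error, in points, for one party's share.

    Assumes a simple random sample; z=1.96 corresponds to 95% confidence.
    """
    return 100 * z * math.sqrt(share * (1 - share) / sample_size)


if __name__ == "__main__":
    # Hypothetical poll: 1,000 respondents, party on 40 percent.
    print(round(margin_of_error(0.40, 1000), 1))
```

Under those assumptions the margin is roughly 3 points on a single party's share, which is already larger than a 2-point overperformance.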

What would be really helpful, frankly, is if we had some polls of particular constituencies, marginal or otherwise. For instance, suppose that we and Robert and the YouGov folks got together in a room and identified 24 constituencies (say, 6 from Scotland and 18 from England and Wales) that would be helpful from a forecasting standpoint. Some of these would in fact be marginal constituencies (including Labour-Conservative marginals, Labour-Lib Dem marginals, three-way marginals, etc.) and others would be purportedly safe seats to form our control group. We would avoid constituencies in which something unusual had gone on, such as major boundary changes, a scandal involving an incumbent candidate, off-kilter demographics, and the like. If results from each of these representative constituencies were published with some regularity, it would be possible to do some very good forecasting. (The reason I’ve gone into some detail in explaining this is that there could very well be another election within the next year in the event of a hung parliament and it seems plausible that one of the pollsters could agree to do something like this.) However, we are very far from that ideal, and in my view what we actually have, sporadic polling of groups of marginal constituencies, provides only very rough guidance.

Robert also posits a mechanism for why Conservatives are doing better in these seats: they’re targeting their resources there. While this is undoubtedly true, it is presumably also the case that Labour will target its resources to defend those same seats, which could potentially cancel their efforts out. Thus, this is fairly speculative.

I continue to have more problems, however, with their decision not to give a bonus to Liberal Democrats in marginal constituencies, given that they have done so for Conservatives. For one thing, if the overperformance of Conservatives in marginal districts does not reflect something so specific as resource allocation but instead reveals more universal characteristics of voter behavior, the most reasonable assumption is that LibDems could expect the same effects to work in their favor.

Robert is absolutely right that handling the LibDem marginals gets complicated because they have the potential to win seats both from Labour and Conservatives; a one-size-fits-all approach would necessarily be problematic. Our model instead thinks about each type of LibDem marginal a little differently…

For instance, suppose you have a seat which was Labour 50, LibDem 30, Conservatives 20 in the last election. Here, LibDems would be poised to make some very large gains. There is a large pool of Labour votes from which to draw, and if there is any kind of tactical voting, most of them should go to the LibDems because the Conservatives are not competitive. This is the type of district in which we expect a lot of the LibDems’ gains to be concentrated, including some seats that most forecasters regard as Safe Labour.

On the other hand, in a district like Conservatives 45, LibDems 40, Labour 10, UKIP 5, LibDems might actually have some trouble, even though they only have a small margin to overcome. Their most reliable source of votes — Labour — are not in great abundance here. Even if they were to net some gains from Conservatives, it might not be enough to overtake them.

Finally, suppose you had a constituency like Conservatives 35, Labour 35, LibDems 30. In this case, LibDems do have quite a few Labour votes to pick off (although so do Conservatives) and they would be the favorites to overtake both parties and win.
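Here is how those three seats behave under one made-up set of switching rates. The rates (a quarter of Labour's vote to LibDems, a tenth to Conservatives, 3 percent of Conservatives to LibDems) are assumptions for illustration only and are not the values in our model; the sketch also leaves out tactical voting and turnout changes, and the outcome in the close seats is sensitive to the rates chosen.

```python
# Hypothetical fractions of each party's previous vote that move elsewhere.
TRANSITIONS = {
    ("Labour", "LibDem"): 0.25,
    ("Labour", "Conservative"): 0.10,
    ("Conservative", "LibDem"): 0.03,
}

# Previous-election results from the three examples in the text.
SEATS = {
    "Labour-LibDem": {"Labour": 50, "LibDem": 30, "Conservative": 20},
    "Conservative-LibDem": {"Conservative": 45, "LibDem": 40, "Labour": 10, "UKIP": 5},
    "Three-way": {"Conservative": 35, "Labour": 35, "LibDem": 30},
}


def project(shares, transitions):
    """Apply each flow as a fraction of the source party's previous vote."""
    result = {party: float(share) for party, share in shares.items()}
    for (src, dst), fraction in transitions.items():
        moved = shares.get(src, 0) * fraction
        result[src] = result.get(src, 0.0) - moved
        result[dst] = result.get(dst, 0.0) + moved
    return result


def winner(result):
    """Party with the largest projected share."""
    return max(result, key=result.get)


if __name__ == "__main__":
    for name, shares in SEATS.items():
        projected = project(shares, TRANSITIONS)
        print(name, winner(projected), projected)
```

With these particular rates the LibDems take the Labour-held seat comfortably, fall just short in the Conservative-held seat despite the small starting gap, and come through the middle in the three-way seat.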

The point, I suppose, is that once you start to depart from a mechanistic application of uniform swing — as Robert and his team have (rightly, IMO) done — it gets a little tricky to handle all of these contingencies in a three-way election. We actually came very close to going with an approach that was very similar to theirs (albeit with a tactical voting adjustment for LibDems as well as Conservatives), but decided not to do so for this reason.

Lastly, I would posit one intuitive argument in defense of our approach. This has been a very dynamic election. You have three viable parties. You have had, for the first time, televised debates. You have the strong possibility of a hung parliament and the politics surrounding that. You have some indications of a boost in turnout. You’ve had five years since the last election, during which an awful lot has gone wrong ranging from the financial crisis to the fallout from Britain’s participation in the Iraq War (something which was still ramping up in 2005) to last year’s MP scandals. We should expect things to get pretty jumbled up, and the factors that determined who voted for whom in the last election may not apply so much this year.

Generally speaking, uniform swing anticipates about the cleanest possible break from the previous election: a relatively efficient transfer of votes from Labour to LibDems and to a lesser extent Conservatives — one which might allow Labour to preserve a relatively high number of seats with a relatively small number of votes because of the advantageous way in which the votes happened to be distributed for them in 2005. The less orderly the break is, however, the less that 2005 can serve as a reliable benchmark (in some sense, the more reversion to the mean that there is), and therefore the more precarious Labour’s position. I expect a relatively ‘messy’ election, and therefore I continue to anticipate potentially large divergences from uniform swing.
