The House forecast that we released on Friday establishes an over-under line for Republican gains at a net of 47 or 48 seats. But, as I noted at the end of the article, the confidence interval on this forecast is very wide. Its margin of error is about ±30 seats — meaning that a gain of as few as 17 seats, or as many as 78, is entirely possible — and there is a small chance of even larger or smaller gains.

When I noted this on Twitter on Friday, I got a few sarcastic replies: what good is a forecast if it tells you that essentially anything can happen? We’ll return to that question at the end of the article. But first, let’s look at a few numbers.

Currently, the folks at Cook Political consider a total of 87 House races to be either toss-ups or to merely “lean” toward one or the other party. This is an unprecedented number in recent history. At a comparable point in the past six election cycles — that is, with about 25 days to go until the election — Cook Political had put the number of highly competitive races at between 34 and 56; this year’s figure is roughly twice as high.

Cook Political and the other expert forecasters that our model uses have a very good track record — but if they have a flaw, it’s that they can be overly cautious, characterizing some races as being highly competitive when that isn’t necessarily borne out by objective evidence. Perhaps they’re simply hedging too much this year?

Well, I don’t think so, because the purely objective indicators also show a sharp increase, roughly a doubling, in the number of competitive races.

The most basic indicator is simply the number of House districts in which both major parties have nominated a candidate. As we’ve noted before, this figure is extremely high this year, largely because of improved coverage by Republicans. In total, 407 of the 435 Congressional Districts have a nominee from both parties this year; this compares with an average of 365 districts over 1998-2008. Looked at another way, only 28 candidates are getting free passes this year — as compared with an average of 70 in recent years.

Now, it’s true that many of these nominees are competing in extremely red or blue districts, and might ordinarily have little chance of winning. But, extraordinary things can sometimes happen in House races, as occurred for instance in Louisiana’s 2nd District in 2008, or Texas’ 22nd District in 2006 — extremely Democratic and Republican districts respectively in which the underdogs prevailed because of unusual circumstances surrounding their opponents.

Also, sometimes in the past, a party has failed to get its act together and has neglected to nominate a candidate even in a district where he’d seem to have a decent chance at winning. Some 15 districts that Cook Political currently rates as competitive this year, including 7 that it rates as highly competitive (“lean” or “tossup”), *were not contested* by both major parties in 2008. So the hard work that both Republicans and Democrats have done in finding nominees is contributing to the competitive environment and has ensured that few if any opportunities will be wasted.

A somewhat higher threshold for viability is one based on fundraising numbers. For instance, in how many districts have both the Republican and Democratic candidates raised at least $100,000 in individual contributions? While $100,000 is not enough to run a full-fledged campaign for the Congress, it’s at least enough to get a candidate on the map in all but the most expensive districts.

The answer — as of the July 15 filing deadline (a fresh set of fundraising numbers will be available in about a week) — is 163 districts. This number represents a big increase from past years: there were 116 such districts at a comparable point in the election in 2008, for instance, and 102 in 2006 — and an average of 77 from 1998 until 2004. Although fundraising has become easier because of the Internet and other means, and although the rate of campaign contributions has been increasing much faster than inflation, this is nevertheless an impressive figure.

Another indicator of competitiveness is simply how many districts have seen polls released into the public domain, since campaigns rarely release polls in races in which they have no shot at all, and since media companies are loath to pay for survey work in districts where the result is a foregone conclusion.

As of today, with about 25 days remaining in the campaign, our database has some kind of polling (including polls released by campaigns) in 150 House districts. This compares with an average of 67 districts at a comparable point in the 2000 through 2008 elections. (The figure was much lower in 1998, but our database had incomplete coverage that year.)

Of course, it is also worth looking at exactly what those polls had to say. In how many districts, for example, do we have *at least one poll* showing the race within single digits?

It turns out that in most of the 150 districts in which we’ve seen some polling, at least one pollster has indeed been ambitious enough to show the race within 10 points. All told, there are 109 such districts thus far this year — roughly double the 57 districts of which the same was true at a comparable point in 2008. Not only have an unusually large number of districts received polling this year — but of those, an unusually large fraction have shown a competitive race.

Put all of this information together, as our forecasting model does, and it currently projects 85 House races to be decided by 10 points or fewer. At a comparable point in the 1998 through 2008 cycles, our model would have thought this to be true of an average of only about 45 races.

According to just about every objective and subjective indicator, then, the number of competitive House districts is roughly twice as high as in recent years. *This* is why the margin of error on our House forecast is very wide. If the polling is off by just a little in one direction or another, it could have profound consequences for the number of seats that Republicans are likely to gain. Likewise, there are a great number of districts in which both parties have viable candidates who could overperform or underperform the trends present in the national environment.
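To make the point about small polling errors concrete, here is a minimal sketch, not the forecasting model itself. It assumes, purely for illustration, that the competitive races have margins spread evenly between -10 and +10 points (positive meaning a Republican lead), and that a polling error shifts every race by the same amount. The function names and the 2-point error are hypothetical; only the 85 and 45 race counts come from the figures above.

```python
def evenly_spaced_margins(n_races, max_margin=10.0):
    """Hypothetical margins (Republican lead, in points), spread evenly
    across the open interval (-max_margin, +max_margin)."""
    step = 2.0 * max_margin / n_races
    return [-max_margin + step * (i + 0.5) for i in range(n_races)]


def republican_wins(margins, polling_error):
    """Count races the Republican wins if every true margin equals the
    polled margin plus a uniform polling_error (an assumed, simplified error)."""
    return sum(1 for m in margins if m + polling_error > 0)


def seat_range(n_races, polling_error):
    """Seats separating the two cases 'polls off by +error' and
    'polls off by -error' across n_races competitive districts."""
    margins = evenly_spaced_margins(n_races)
    high = republican_wins(margins, +polling_error)
    low = republican_wins(margins, -polling_error)
    return high - low


# 85 competitive races (this year) versus about 45 (recent cycles),
# with an assumed uniform polling error of 2 points either way.
this_year = seat_range(85, 2.0)
recent_years = seat_range(45, 2.0)
print(this_year, recent_years)
```

With these made-up inputs, the same 2-point miss moves about twice as many seats in the 85-race field as in the 45-race field, which is the mechanism behind the wider margin of error.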

Why are so many races competitive? That could merit an article on its own. I suspect much of the reason is that the deterioration in the political environment for Democrats was evident quite early in the cycle — certainly by around August or September of last year — leaving both parties with plenty of time to prepare. The fact that the Internet has made fundraising much less burdensome, and allowed name recognition to be built through a variety of “nontraditional” means, may also play a role.

But whatever the reasons, the dynamics of the battle for the House are much different this year than in the recent past. In this context, the uncertainty that our model implies should be seen as a feature rather than a bug. We are certainly not afraid to make bold forecasts where they are warranted: for instance, our model was much quicker than others, after the financial collapse in September 2008, to figure out that John McCain had essentially zero chance of winning the Presidency. As I have written about extensively, it also makes some very certain-seeming forecasts in some individual races for Senate and governor, regarding candidates as 90 percent, 95 percent, or even 99 percent favorites in some contests that others regard as toss-ups.

Sometimes, however, a thorough and objective analysis of the data leads one to the opposite conclusion: that the competition is too sure of itself. Our model figures there is a very wide range of potential outcomes because *that is the only responsible forecast*. We’re not being meek or wishy-washy: instead, we are firmly, boldly, affirmatively and happily embracing the uncertainty. This is not because of any intrinsic property of our forecasting model; rather, it is because of the particular set of circumstances on the ground this year.

If anything, I worry that our model implies *too little* uncertainty. Generally speaking, forecasting models based on past data tend to overrate their accuracy when applied to out-of-sample data, although I design my models with this principle in mind to try to minimize such effects.

So, if you force it to pick a number, our model projects a Republican gain of about 48 seats (that projection could change, of course, by Election Day). But because of the high amount of uncertainty intrinsic to the forecast, I couldn’t really take any great issue with a model that made a “best guess” of 56 seats, or 37 seats, instead.

What I can do, however, is warn you away from models (and individuals) that claim to be able to forecast the number of Republican gains with pinpoint accuracy — particularly those that make no effort to take account of the environment on a district-by-district basis. One good bet you might consider taking up with a friend this year, for instance, is asking him to forecast the number of Republican gains within a range of ±5 seats. If our analysis of this race is correct, then *no matter which number he picks*, you’ll be at least a 3:1 favorite to win this bet: for instance, the number of Republican gains has only about a 25 percent chance of falling within 48 seats ±5 (meaning, anywhere from 43 to 53). Even a bet that spotted your friend a range of plus-or-minus *10 seats* should theoretically not lose more than pennies on the dollar.
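The arithmetic behind that bet can be roughly checked with a back-of-the-envelope sketch. It assumes, and this is an assumption rather than a description of the actual model, that Republican gains are approximately normally distributed around 48 seats and that the ±30-seat margin of error is a 95 percent interval, so the standard deviation is 30 / 1.96. The function names are hypothetical.

```python
import math


def normal_cdf(x, mean, sd):
    """Cumulative distribution function of a normal distribution."""
    return 0.5 * (1.0 + math.erf((x - mean) / (sd * math.sqrt(2.0))))


def prob_within(center, half_width, mean, sd):
    """Probability that the outcome lands in [center - half_width, center + half_width]."""
    upper = normal_cdf(center + half_width, mean, sd)
    lower = normal_cdf(center - half_width, mean, sd)
    return upper - lower


MEAN_GAIN = 48.0        # best guess from the forecast
MARGIN_95 = 30.0        # margin of error, ASSUMED here to be a 95% interval
SD = MARGIN_95 / 1.96   # implied standard deviation under the normal assumption

p_within_5 = prob_within(48, 5, MEAN_GAIN, SD)    # your friend picks 48, +/- 5 seats
p_within_10 = prob_within(48, 10, MEAN_GAIN, SD)  # same pick, +/- 10 seats
p_off_center = prob_within(60, 5, MEAN_GAIN, SD)  # a pick away from the best guess

print(f"within 5 of 48:  {p_within_5:.2f}")
print(f"within 10 of 48: {p_within_10:.2f}")
print(f"within 5 of 60:  {p_off_center:.2f}")
```

Under these assumptions the ±5 window around 48 comes out near one in four, consistent with the 3:1 odds described above, and the ±10 window comes out close to even money. Any pick other than 48 does worse, since the normal curve peaks at its mean.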

I’ll acknowledge that this puts us in a somewhat awkward position. We’ll probably get a lot of credit if we make a “best guess” of a gain of 48 Republican seats, and they in fact win 47, or 50 — or 48 exactly. But really, we think the value the model provides to *New York Times* readers is that it is smart enough to know what it can’t and doesn’t know.