How Meteorologists Botched The Blizzard Of 2015

Roads and trains were shut down across the New York area Monday night and into Tuesday, and for what? It snowed in New York, but only 9.8 inches fell in Central Park after predictions of a foot and a half or more. What went wrong? Forecasters, including yours truly, decided to go all-in on one weather model: the European model (or Euro).

And the Euro was way off. Other models had this storm pegged.¹

Update after update, the Euro (produced by the European Centre for Medium-Range Weather Forecasts) kept predicting very high snow totals in New York. As of Monday morning’s run, the Euro was still projecting a foot and a half in the city. This consistency was too great for forecasters to ignore, especially because the Euro had been the first to jump on events such as the blizzard of 1996 and Hurricane Sandy. It also was one of the first to predict that a March 2001 storm was going to, like this one, be a bust. The Euro had a good track record.

That consistency, though, hid a great sense of uncertainty. The SREF (or Short-Range Ensemble Forecast), produced by the National Weather Service, collects 21 models (shown below). And Sunday night, the SREF indicated that the storm could be very different. Five of the 21 models in the SREF had (on a 10:1 snow-to-liquid ratio) less than 10 inches of snow falling. Nine of the 21 predicted a foot or less. Only eight could have been said to support 18 or more inches of snow in New York City.

In other words, 57 percent of the SREF members Sunday night suggested the forecasts were far too gung-ho. By Monday afternoon, 11 of the 21 members were on the 10-inches-or-less train. Eight of the 21 still supported big-time snow, but they were a minority.
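To make the ensemble numbers easier to compare, here is a minimal sketch that turns the member counts quoted above into shares of the 21-member SREF. Treating each member as equally likely is an assumption of this sketch, not something the SREF itself promises, and the names `member_share`, `sunday_night` and `monday_afternoon` are made up for illustration.

```python
# SREF member counts (out of 21) as quoted in the text.
# The categories overlap: members under 10 inches are also "12 inches or less".
TOTAL_MEMBERS = 21

sunday_night = {
    "under 10 inches": 5,
    "12 inches or less": 9,
    "18 inches or more": 8,
}
monday_afternoon = {
    "10 inches or less": 11,
    "18 inches or more": 8,
}


def member_share(count, total=TOTAL_MEMBERS):
    """Return the percentage of ensemble members behind an outcome."""
    return 100.0 * count / total


for run, counts in (("Sunday night", sunday_night),
                    ("Monday afternoon", monday_afternoon)):
    for outcome, count in counts.items():
        share = member_share(count)
        print(f"{run}: {outcome}: {count}/{TOTAL_MEMBERS} = {share:.0f}%")
```

Read this way, the big-snow scenario never had much more than about a 38 percent share of the ensemble in either run.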

The SREF members were not alone in being suspicious of so much snow. In Sunday’s 7 p.m. run, all of the other major models were against the Euro.

The American Global Forecast System (GFS), which was recently upgraded, had only about 20 millimeters of liquid precipitation (or about 8 inches of snow on a 10-to-1 ratio) falling for the storm. Although the GFS is considered inferior to the Euro by many meteorologists, the difference is probably overrated. Both models perform fairly well over the long term, as was pointed out in The New York Times this week. The GFS was showing the storm would stall too far northeast for New York to get the biggest snows. Instead, as we are seeing, those larger totals would be concentrated over Boston.
The GFS solution probably shouldn’t have been ignored given that it was joined by the Canadian global model, which had only 25 millimeters (or about 10 inches on a 10-to-1 ratio) falling as snow. The Canadian short-range model was slightly more pessimistic than the global. It predicted only about 20 to 25 millimeters (or 8 to 10 inches on a 10-to-1 ratio) of snow.
The United Kingdom’s model, which typically rates as the second-most accurate behind the Euro, was also on the little-snow train in New York. It had only 20 millimeters (or 8 inches on a 10-to-1 ratio) falling as snow.
Even the United States’ short-range North American Mesoscale (NAM) model was on board with smaller accumulations, though it would change its tune in later runs and agree with the Euro for a time. On Sunday night, the NAM also went with about 20 millimeters (or 8 inches on a 10-to-1 ratio).
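The millimeter figures above are liquid precipitation, and the inch figures come from applying the 10-to-1 snow-to-liquid ratio. A minimal sketch of that arithmetic follows. The function name `liquid_mm_to_snow_inches` is hypothetical, and a fixed 10-to-1 ratio is the same simplifying assumption the text uses; real ratios vary from storm to storm.

```python
MM_PER_INCH = 25.4  # exact definition of the inch in millimeters


def liquid_mm_to_snow_inches(liquid_mm, snow_to_liquid_ratio=10.0):
    """Convert liquid precipitation in millimeters to inches of snow.

    Assumes a constant snow-to-liquid ratio (10-to-1 by default).
    """
    liquid_inches = liquid_mm / MM_PER_INCH
    return liquid_inches * snow_to_liquid_ratio


for mm in (20, 25):
    snow = liquid_mm_to_snow_inches(mm)
    print(f"{mm} mm of liquid -> {snow:.1f} inches of snow at 10-to-1")
```

So 20 millimeters works out to a little under 8 inches of snow, and 25 millimeters to a little under 10.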

Put it all together, and there was plenty of evidence this storm wouldn’t be record-setting in New York. Of course, forecasters are going to miss on occasion. Forecasting weather is very difficult. Models aren’t perfect, and forecasters should be practicing meteorology and not “modelology.”

That said, there are a few lessons to be learned:

I’m not sure forecasters (including amateurs like myself) did a good enough job communicating to the public that there was great uncertainty in the forecast. This has been a problem for media forecasters who have historically been too confident in predicting precipitation events. A study of TV meteorologists in Kansas City found that when they predicted with 100 percent certainty that it would rain, it didn’t one-third of the time. Forecasters typically communicate margin of error by giving a range of outcomes (10 to 12 inches of snow, for example). In this instance, I don’t think the range adequately showed the disagreement among the models. Perhaps a probabilistic forecast is better.
No model is infallible. Forecasters would have been better off averaging all the model data together, even the models that don’t have a stellar record. The Euro is king, but it’s not so good that we should ignore all other forecasts.
There’s nothing wrong with changing a forecast. When the non-Euro models (except for the NAM) stayed consistent in showing about an inch or less of liquid precipitation (or 10 inches of snow on a 10-to-1 ratio) reaching New York and the Euro backed off its biggest predictions Monday afternoon, it was probably time for forecasters to change their stance. They waited too long; I’m not sure why.
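As a rough illustration of the averaging idea in these lessons, the sketch below takes the snowfall figures quoted earlier for each model and computes an equal-weight mean, a median and the spread. Several things here are assumptions for the sake of the example: the Euro is entered as 18 inches ("a foot and a half"), the Canadian short-range model as the midpoint of its 8-to-10-inch range, and every model gets the same weight. The function name `summarize` is made up.

```python
from statistics import mean, median

# Approximate New York snowfall (inches, 10-to-1 ratio) per model,
# taken from the figures quoted in the text.
model_snow_inches = {
    "Euro": 18.0,                 # "a foot and a half"
    "GFS": 8.0,
    "Canadian global": 10.0,
    "Canadian short-range": 9.0,  # assumption: midpoint of 8 to 10 inches
    "UK": 8.0,
    "NAM": 8.0,                   # Sunday night run only
}


def summarize(forecasts):
    """Return an equal-weight mean, the median and the range of forecasts."""
    values = list(forecasts.values())
    return {
        "mean": mean(values),
        "median": median(values),
        "low": min(values),
        "high": max(values),
    }


summary = summarize(model_snow_inches)
print(f"Equal-weight mean: {summary['mean']:.1f} inches")
print(f"Median: {summary['median']:.1f} inches")
print(f"Range: {summary['low']:.0f} to {summary['high']:.0f} inches")
```

Under those assumptions the simple average comes out to about 10 inches, not far from the 9.8 inches measured in Central Park, while the 8-to-18-inch range shows how much the models disagreed. This is one storm viewed in hindsight, so it illustrates the point rather than proving that a plain average always wins.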

Meteorology deals in probabilities and uncertainty. Models, and the forecasters who use those models, aren’t going to be perfect. In this case, there was a big storm. It just so happened to be confined to eastern Long Island and southern New England. But that’ll do little to satisfy New Yorkers who expected a historic blizzard.

Footnotes

¹ I’m concentrating on New York City for the simplicity of this post, though this analysis applies to Philadelphia as well.

FiveThirtyEight
