After dozens of grueling hours of investigation, FiveThirtyEight can confirm that Punxsutawney Phil is a charlatan. A groundbreaking analysis has revealed that the Pennsylvania-based groundhog who makes annual predictions about the arrival of spring is not nearly as reliable a prognosticator as those close to him claim. Phil, arguably the world’s best-known rodent weather predictor, has been forecasting spring’s arrival every Groundhog Day since 1887. But when his predictions are compared with historical weather data, he’s right only about a third of the time.
Phil’s predictions, like those of most groundhog prognosticators, depend on whether or not he sees his shadow on the morning of Feb. 2. According to legend, if he sees his shadow, there will be six more weeks of winter. If he doesn’t see his shadow, that portends an early spring (a prediction that feels more worrisome than hopeful in light of climate change). Phil speaks his prediction to the president of the Punxsutawney Groundhog Club in “Groundhogese,” a language only the president can understand, and the president then translates for the masses.
But this is where chaos can occasionally ensue. Tom Dunkel is the club’s vice president and a member of its “inner circle,” a group of 14 local dignitaries (including Phil) charged with keeping the Groundhog Day tradition alive. In a recent interview, he said that one year, after Phil predicted an early spring, the region was hit with a lot of snow. The club’s president at the time realized he must have misinterpreted Phil’s prediction.
“Because Phil is always right,” Dunkel said, recalling the incident.
But the numbers — THE NUMBERS — tell a different story.
In fairness to Phil and his supposedly sterling record, evaluating these predictions is tricky. What constitutes an early spring? Should a month with an average temperature 2 degrees higher than the historical average count as one? What about 5 degrees? What if all the snow melts and it’s warm and sunny for a week in February, but then March comes in like a lion? Dunkel said the club does not have a specific date or even definition of an early spring.
“There is no sharp guideline on what the temperature will be or the turning point of a particular day,” Dunkel said.
In an effort to be generous in our interpretation (and following the lead of the National Oceanic and Atmospheric Administration’s (NOAA) previous groundhog analysis), our analysis has an early spring bias — only one month needs to have had an above-average temperature for a year to count as an early spring, even if it’s only a slight increase in temperature. Though a half-degree above average is hardly a delightfully early eruption of spring by most standards, it qualifies by our metric.
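The generous rule described above is simple enough to write down. What follows is a minimal sketch, not the code behind this analysis: the function name `is_early_spring`, the choice of February and March as the months checked, and the sample numbers are all assumptions made for illustration.

```python
def is_early_spring(feb_anomaly, mar_anomaly):
    """Classify a year under the lenient rule.

    Each argument is a temperature anomaly: that month's average
    temperature minus its historical average, in degrees.
    (Hypothetical inputs; the months checked are an assumption.)
    """
    # One above-average month is enough, however slight the difference.
    return feb_anomaly > 0 or mar_anomaly > 0


# Made-up numbers: a February half a degree above average, a cold March.
mild_february = is_early_spring(0.5, -2.0)

# Made-up numbers: both months below average.
cold_year = is_early_spring(-1.0, -0.1)
```

Under this sketch, the first toy year counts as an early spring even though March ran cold, which is exactly the bias the rule builds in.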
If we look at Phil’s predictions nationally (as Dunkel encouraged), they were accurate only 36 percent of the time (judging by national average temperatures). But how can just one marmot know a whole nation’s weather? Looking at a more granular level, by region, doesn’t help much. Phil’s accuracy ranged from 50 percent in the Southwest and South to just 36 percent in the West. Even in the Northeast, where Phil’s home state of Pennsylvania is located (and where one might expect him to best forecast weather patterns), he was only 39 percent accurate. That’s similar to what other organizations, such as Time, the Washington Post and NOAA, have found about Phil’s record (though our analysis uses different weather data and compares Phil to other prognosticators).
But perhaps (perhaps!) the accuracy of a soothsaying animal meteorologist doesn’t matter. Michael Venos, a database administrator from Roxbury, New Jersey, and the creator of countdowntogroundhogday.com (a database of, among other things, animal prognosticators’ historical predictions), told FiveThirtyEight that Groundhog Day isn’t about being right or wrong. It’s about fun.
“I don’t know if I should say this, but I don’t put a lot of stock in whatever they predict,” Venos said. “To me it’s definitely just the fun of the ceremony and finding out about all the different, strange [traditions].”
That’s because there are many different and strange animals who claim to tell you the weather’s future. Most are groundhogs, but other species have taken up the role as well. In Concord, Ohio, a cat named Casimir predicts the changing season based on how he consumes a dinner of pierogies (if he eats them “sloppily,” it means a lengthy winter). At the Oklahoma City Zoo, grizzly bear brothers Will and Wiley prognosticate by choosing among different boxes of treats. And in Palm Desert, California, a desert tortoise named Mojave Maxine heralds the start of spring by emerging from her den after her annual brumation (the reptilian answer to hibernation).
To get a more complete picture of Phil’s accuracy rate, we compared it to the track records of eight of his competitors (some sentient, others not), one from each of the eight other climate regions, from 2012 to 2021. Overall, prognosticators’ average accuracy rates ranged from 26 to 63 percent, with greater variability in individual regional forecasts. General Beauregard Lee of Jackson, Georgia (widely considered to be the Punxsutawney Phil of the South) had the best overall rate, averaging 63 percent accuracy across all nine regions from 2012 to 2021.
One thing that appeared to aid the accuracy of each animal’s forecast was how often they predicted an early spring. The three critters that foretold early flowers at least 60 percent of the time — Buckeye Chuck, Lee and Jimmy the Groundhog — had three of the four best overall prediction rates. Meanwhile, Prairie Dog Pete and Chuck Wood predicted early spring 20 and 0 percent of the time, respectively, and finished with the worst and second-worst accuracy. And as mentioned above, our analysis is biased toward an early spring because it’s easily triggered by a slightly warmer temperature than average, giving these prognosticators an edge.
However, if we define an early spring as a year in which February or March temperatures are at least 5 degrees above average (arbitrary, but a meaningful difference), then Lee’s accuracy drops to 30 percent in his local region. Meanwhile, applying that same threshold to Phil’s predictions (from 1994 to 2021) actually improves his accuracy to 50 percent in his home region and 68 percent in the Northwest, South and Southeast.
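To see how much the definition matters, here is a rough sketch of an accuracy calculation with an adjustable threshold. It is an illustration under stated assumptions, not the method used for the figures above: the function name `accuracy`, the 0.5-degree "lenient" cut-off, and every number in the toy data are invented for the example.

```python
def accuracy(predictions, anomalies, threshold):
    """Share of years in which the forecast matched the outcome.

    predictions: list of bools, True meaning "early spring" was forecast.
    anomalies:   list of (feb, mar) temperature anomalies in degrees,
                 one pair per year (hypothetical inputs).
    threshold:   a year counts as an early spring if either month is
                 at least this many degrees above average.
    """
    hits = 0
    for predicted_early, (feb, mar) in zip(predictions, anomalies):
        actual_early = feb >= threshold or mar >= threshold
        if predicted_early == actual_early:
            hits += 1
    return hits / len(predictions)


# Made-up toy data for a forecaster who mostly calls for more winter.
predictions = [False, False, False, True]
anomalies = [(0.5, -1.0), (1.2, 0.3), (6.0, 2.0), (0.8, 1.0)]

lenient = accuracy(predictions, anomalies, threshold=0.5)  # 1 hit in 4 years
strict = accuracy(predictions, anomalies, threshold=5.0)   # 2 hits in 4 years
```

In this toy data, raising the bar turns two mildly warm years into long winters, so the same winter-heavy forecasts score 0.5 instead of 0.25. That is the same direction of change described above for Phil, though the numbers here are purely illustrative.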
Each critter’s accuracy depends on how you define early spring or long winter, making the entire prognosticating business feel intentionally abstract. It’s almost as if the entire Groundhog Day industrial complex were designed to be just vague enough to make any quantitative analysis impossible, thereby preserving rodent prognostication supremacy indefinitely. Almost.
“Groundhog Day is about fun, really, not about scientific facts,” said Dunkel. “We had a former [club] president who once said that there are a lot of serious things happening around the world, and Groundhog Day is not one of them.” Surely there’s a way to test that with data …