It took a couple of bold pickups the week of the trade deadline, but the Kansas City Royals had finally done it.
Solidified themselves as clear front-runners for the American League pennant? Emerged as outright World Series favorites?
Kansas City’s big accomplishment was simply amassing enough talent to break .500 down the season’s final stretch, at least in the eyes of the statistical projections. Although the Royals had never dropped below .566 all season (and had posted the best winning percentage in the AL), leading sabermetric think tank Fangraphs hadn’t pegged them to win more than half of their remaining games until July 26. For most of the year, Kansas City has had the record of a contender but the forecast of a lightweight.
We’re not picking on Fangraphs. The 79 wins it forecast for the Royals before the season started (barring major personnel changes or extreme breakouts from current players, the preseason forecast largely determines a team’s rest-of-season projection) were actually on the high side. Although KC won 89 games and went to the World Series in 2014, a consensus average of betting over/unders (plus implied win totals derived from preseason World Series odds when available) and other statistical systems (PECOTA projections for the team, as well as a regressed average of its Pythagorean winning percentages over the previous two seasons) would have pegged the Royals for 76 wins this year, a number that will likely end up at least 15 games low. Any projection system tied to the Royals’ comparatively weak preseason forecast would have been similarly bearish on their future record.
And the Royals aren’t alone: The Houston Astros, Minnesota Twins and New York Yankees could all potentially beat their consensus preseason projections by double digits, while the Oakland A’s, Boston Red Sox, Miami Marlins and Seattle Mariners may undershoot theirs by that margin. Forecasting the fates of 30 different baseball teams has always been tricky work, but this season has seemed so unpredictable that it has sparked extra rounds of self-examination among statheads.
Paradoxically, in an age of unprecedented baseball data, we somehow appear to be getting worse at knowing which teams are — and will be — good.
In an absolute sense, this season’s forecast win totals aren’t any further off than usual. (Extrapolating records to 162 games, the root mean square error between actual and predicted wins is lower this year than the seasonal average from 1996 to 2014.) But that obscures the way predictions (and, in fact, actual team records) have also gotten more compressed over the years. As a result of the trend toward parity in MLB, preseason projections explain less of the variation among teams’ records now than they have at any point in the last 20 seasons.
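To see how the same absolute error can coexist with weaker explanatory power, here is a minimal Python sketch. It assumes that "explaining variation" means the squared correlation between projected and actual wins, which may not be the exact measure used here. The function names and all of the win totals are hypothetical, chosen only so that both leagues share identical forecast misses.

```python
import math

def extrapolate_wins(wins, losses, season_games=162):
    """Scale a partial-season record to a full-season win total."""
    return season_games * wins / (wins + losses)

def rmse(predicted, actual):
    """Root mean square error between projected and actual wins."""
    squared = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return math.sqrt(sum(squared) / len(squared))

def r_squared(predicted, actual):
    """Squared Pearson correlation: share of variation explained (an assumption)."""
    n = len(predicted)
    mean_p = sum(predicted) / n
    mean_a = sum(actual) / n
    cov = sum((p - mean_p) * (a - mean_a) for p, a in zip(predicted, actual))
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    var_a = sum((a - mean_a) ** 2 for a in actual)
    return cov ** 2 / (var_p * var_a)

# Hypothetical forecast misses, applied identically to both leagues.
errors = [6, -6, 6, -6, 0]

# A league with a wide spread of projected talent...
wide_pred = [65, 73, 81, 89, 97]
wide_actual = [p + e for p, e in zip(wide_pred, errors)]

# ...and a compressed league where every team is projected near .500.
tight_pred = [77, 79, 81, 83, 85]
tight_actual = [p + e for p, e in zip(tight_pred, errors)]

print("RMSE wide: ", rmse(wide_pred, wide_actual))
print("RMSE tight:", rmse(tight_pred, tight_actual))
print("R^2 wide:  ", r_squared(wide_pred, wide_actual))
print("R^2 tight: ", r_squared(tight_pred, tight_actual))
```

In this toy setup the two leagues have the same root mean square error, but the projections explain far less of the variation in the compressed league, because there is less real separation between teams for the forecast to capture.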
Strangely, the projections are doing fine at the player level. Neither hitter nor pitcher projections are necessarily to blame for the downturn in team-level forecasts. If anything, PECOTA is better now at projecting rate statistics for batters than it was five years ago, and at the very least it has gotten no worse on the pitching side. Likewise, PECOTA’s ability to nail playing-time estimates (both plate appearances and innings pitched) has only improved over that span. So in the aggregate, it’s hard to detect the slump in team projection accuracy by looking at the performance of individual player forecasts.
But while PECOTA’s absolute prediction errors are getting smaller across the entire population of MLB players, its squared errors — a gauge more sensitive to outliers — have increased over the last five seasons. For that kind of discrepancy to exist, there can be only one explanation: The big misses are getting bigger, at least relative to the normal, everyday misses. And, notably, more of those extreme errors come when predicting the performance of young players.
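The logic is easier to see with numbers. The sketch below uses two made-up sets of projection errors (not PECOTA data) to show how mean absolute error can fall while root mean square error rises, which happens when most misses shrink but a few get much larger.

```python
import math

def mean_absolute_error(errors):
    """Average size of the misses, with every miss weighted equally."""
    return sum(abs(e) for e in errors) / len(errors)

def root_mean_square_error(errors):
    """Squaring each miss first gives large outliers extra weight."""
    return math.sqrt(sum(e ** 2 for e in errors) / len(errors))

# Hypothetical projection errors in arbitrary units.
earlier_errors = [3, 3, 3, 3, 3, 3]   # uniformly mediocre forecasts
later_errors = [1, 1, 1, 1, 1, 10]    # mostly sharp, with one big miss

for label, errs in [("earlier", earlier_errors), ("later", later_errors)]:
    print(label, mean_absolute_error(errs), root_mean_square_error(errs))
```

The later set is better by absolute error and worse by squared error, purely because of its single large miss.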
By now, it’s no secret that baseball is in the midst of a historic youth movement. As the average age of players has decreased, a lot more of the game’s value has been concentrated among its fresh faces. That’s hailed as a good thing for the game, but it may be a bad thing for projection systems. For hitters ages 24 and younger, we found that absolute prediction errors in their rate statistics are on the rise since 2009, with an even more pronounced trend toward inaccuracy if outliers are given more weight. Since those players now contribute more to the game than at any other point in recent memory, they could be playing a role in driving the recent projection crisis.
There could be other culprits. Teams may be better now at assessing themselves than public metrics are. If the internal projection systems some clubs employ are superior to the ones driving published preseason forecasts, those teams could be buying and selling talent according to a different rubric. As a result, they could be constructing their rosters in a way that would amplify team-level errors in the public forecasts — for example, loading up on publicly underrated players — even if the player-level accuracy of public projections hasn’t changed much.
Then again, maybe it’s all just luck — we mean literally. By definition, the compression of team records across MLB means that random variance is playing a larger role in the standings than it used to. How much larger? Computing the spread of true talent in a season using the standard deviation of team winning percentages, it turns out that a whopping 64 percent of the observed variation among teams so far this season can be explained by binomial luck — by far the highest single-season proportion of the past two decades.
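A rough version of that calculation can be sketched as follows. This is one common way to set it up, not necessarily the exact method behind the 64 percent figure: it assumes every team is a true .500 club for the purposes of the luck term and that all teams have played the same number of games. The winning percentages in the example are hypothetical.

```python
import statistics

def luck_variance(games_played, true_pct=0.5):
    """Variance in winning percentage expected from binomial chance alone."""
    return true_pct * (1 - true_pct) / games_played

def luck_share(win_pcts, games_played):
    """Fraction of observed variance in winning percentage attributable to luck."""
    observed = statistics.pvariance(win_pcts)
    return luck_variance(games_played) / observed

def talent_sd(win_pcts, games_played):
    """Implied spread of true talent: observed variance minus luck variance."""
    observed = statistics.pvariance(win_pcts)
    remaining = max(observed - luck_variance(games_played), 0.0)
    return remaining ** 0.5

# Hypothetical winning percentages after 100 games each.
win_pcts = [0.60, 0.55, 0.52, 0.50, 0.48, 0.45, 0.40]
games = 100

print("luck share of variance:", luck_share(win_pcts, games))
print("implied talent spread: ", talent_sd(win_pcts, games))
```

The tighter the observed winning percentages, the smaller the denominator, so the luck term accounts for a larger share even though the amount of luck per game has not changed.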
Even if that number regresses a bit over the season’s final third, 2015 will shatter the previous post-1995 record for luck’s sway over team winning percentages. This fact alone may go a long way toward explaining why projections are struggling.
It’s tough to know what all of this means for a team like Kansas City. The Royals were smart to go all-in at the trade deadline, and as an older team they figure to be less affected by the predictive uncertainty currently plaguing baseball. Ironically, though, that means we should probably be more confident in the relatively unimpressive rest-of-season forecast set for them by a site like Fangraphs, which still regards the Royals as a team with 84-win true talent even after accounting for their deadline pickups.
It’s a long-held saying that baseball’s playoffs are a crapshoot, but the unexpectedly great performances of teams like Kansas City this year might indicate the regular season is headed in that direction, too.