Of all the strategic elements of baseball, few are more fascinating than the poker game between pitcher and hitter. Each participant knows his strengths and those of his adversary, and that knowledge informs both players’ tactics in a complex entanglement of actions and counteractions.
If the best pitch in a hurler’s repertoire is his fastball, for instance, he might be inclined to throw it as often as possible. But batters will pick up on that proclivity, and in time, the fastball will lose its effectiveness if it’s not balanced against, say, a change-up, even if the fastball is a far better pitch on paper.
Eventually, we would expect this pitcher’s arsenal to settle into the optimal mix for retiring opposing hitters: a mix of fastballs and change-ups that’s impossible for a batter to exploit. In game theory terms (and assuming the batter adapts accordingly), this is a version of the famous Nash equilibrium, which describes a situation in which neither party in a game has anything to gain by changing his or her strategy.
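To make the equilibrium idea concrete, here is a minimal Python sketch of a two-pitch guessing game. Everything in it is an assumption made for illustration: the function name `mixed_equilibrium`, the framing of the batter as "sitting on" one pitch or the other, and the run values, which are invented rather than taken from any real pitcher.

```python
def mixed_equilibrium(a, b, c, d):
    """Solve a 2x2 zero-sum pitch-guessing game (illustrative model only).

    Each argument is the runs the batter gains on one pitch:
      a: fastball thrown, batter sitting on the fastball
      b: fastball thrown, batter sitting on the change-up
      c: change-up thrown, batter sitting on the fastball
      d: change-up thrown, batter sitting on the change-up

    Returns (p, q, value), where p is the share of fastballs thrown,
    q is the share of pitches on which the batter sits on the fastball,
    and value is the batter's expected runs per pitch at equilibrium.
    """
    denom = a - b - c + d
    if denom == 0:
        raise ValueError("degenerate game: no unique mixed equilibrium")

    # p makes the batter indifferent between his two guesses.
    p = (d - c) / denom
    # q makes the pitcher indifferent between his two pitches.
    q = (d - b) / denom
    if not (0 < p < 1 and 0 < q < 1):
        raise ValueError("no interior mix: one pure strategy dominates")

    value = p * a + (1 - p) * c
    return p, q, value


def pitch_values(a, b, c, d, q):
    """Runs allowed per pitch by each pitch type, given the batter's mix q."""
    fastball = q * a + (1 - q) * b
    changeup = q * c + (1 - q) * d
    return fastball, changeup


# Hypothetical run values: guessing right helps the batter, guessing wrong hurts.
p, q, value = mixed_equilibrium(0.20, -0.10, -0.15, 0.05)
fastball, changeup = pitch_values(0.20, -0.10, -0.15, 0.05, q)
print(f"fastball share: {p:.2f}, batter sits fastball: {q:.2f}")
print(f"runs per pitch: fastball {fastball:.3f}, change-up {changeup:.3f}")
```

With these made-up numbers the pitcher ends up throwing about 40 percent fastballs, and at that point both pitches allow the same number of runs per pitch. That equality across pitch types is the signature of equilibrium in this toy model.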
That’s all, well, theory. But how can we detect which real-life pitchers are closest to their equilibria? One idea is to look for hurlers whose effectiveness is roughly equal across every kind of pitch they throw. And fortunately, FanGraphs tracks not only the frequency with which each pitch type is employed, but also its potency, estimated in terms of runs added or subtracted per 100 pitches. Using that data to find out how balanced a pitcher’s performance is across his entire repertoire, I computed a metric that I’m dubbing the “Nash Score.”
Here’s how it works: Start by measuring for each pitch type the difference between its effectiveness and that of all the pitcher’s other pitches combined. Then weight those differences according to the frequency with which each pitch is thrown. The resulting average is the Nash Score, a sort of variance that measures whether a pitcher is close to his equilibrium (lower score) or could conceivably benefit from varying the distribution of his pitches (higher score).
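The steps above can be sketched in a few lines of Python. This is a rough reading of the description, not the exact calculation behind the published scores: the function name `nash_score`, the input layout, the use of an absolute difference (with an optional `power` for a squared version) and the sample numbers are all assumptions.

```python
def nash_score(pitches, power=1):
    """Usage-weighted average gap between each pitch and the rest.

    pitches: dict mapping pitch name -> (usage_share, runs_per_100).
    power:   1 for absolute differences, 2 for squared differences
             (which form the original metric uses is an assumption here).
    """
    total_share = sum(share for share, _ in pitches.values())
    score = 0.0
    for name, (share, value) in pitches.items():
        rest_share = total_share - share
        if rest_share <= 0:
            continue  # a one-pitch pitcher has nothing to compare against
        # Run value of all the other pitches combined, weighted by usage.
        rest_value = sum(
            s * v for other, (s, v) in pitches.items() if other != name
        ) / rest_share
        # Weight this pitch's gap by how often it is thrown.
        score += (share / total_share) * abs(value - rest_value) ** power
    return score


# Hypothetical pitchers: usage share and run value per 100 pitches.
balanced = {"fastball": (0.6, 0.5), "slider": (0.3, 0.5), "change-up": (0.1, 0.5)}
lopsided = {"knuckleball": (0.87, 1.0), "fastball": (0.13, -1.0)}

print(nash_score(balanced))  # every pitch equally effective, so no gap
print(nash_score(lopsided))  # one pitch far better than the other
```

A pitcher whose pitches are all equally effective scores zero, while a wide gap between his pitches pushes the score up, matching the lower-is-closer-to-equilibrium reading in the text.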
Take R.A. Dickey as an example. The Blue Jays starter, known for his mesmerizing knuckleball, throws the pitch 87 percent of the time, about as much as any pitcher in baseball uses his No. 1 pitch. Yet Dickey’s Nash Score isn’t especially low, so under the concept of equilibria outlined above, he should be using the knuckler even more. Dickey’s fastball, his No. 2 (and essentially only other) pitch, is far less effective than his knuckler, even in its limited use as a complementary, change-of-pace pitch. According to game theory, Dickey could conceivably boost his overall effectiveness by throwing the knuckleball on an even greater proportion of his pitches.
In the chart, a pitch type’s “relative value” is its run value (per 100 pitches) compared to that of all the pitcher’s other pitches combined. So this method surmises that Tanner Roark of the Washington Nationals is closer to operating at equilibrium than any other pitcher because his two most frequent pitches (fastball and slider) are each barely more effective than the rest of his repertoire, and his third choice (change-up) is barely less effective. Based on our game theory of pitch selection, it stands to reason that Roark wouldn’t get much of a performance boost by altering his frequencies.
But is this actually true? Roark was very good in 2013 (mostly out of the bullpen) with what this system considers a less-optimal mix of pitches than he has now. He was merely solid last year (as a starter) even though his pitch frequencies were far closer to the theoretical equilibrium. And he has been pretty dismal in 2015 (mostly back in the ’pen) despite maintaining a supposedly optimal pitch mix. Therein lies one problem with this methodology: Pitcher performance is notoriously unstable in small samples — and small samples are all we have when trying to zero in on the equilibrium frequencies for a pitcher’s current repertoire.
The year-to-year correlation of a pitcher’s Nash Score, for instance, is essentially zero. Optimally mixing pitches one season is no guarantee of doing it the next, in part because per-pitch performance measures are themselves fickle, but also because the batter-pitcher chess match is in a state of constant flux. Furthermore, there’s little relationship between how good a pitcher is and how well he optimizes his repertoire. The Miami Marlins’ Jose Fernandez is one of the best pitchers on Earth, yet this method says he’s still drastically overusing his fastball and underusing his breaking pitches.
Then again, Fernandez might also be illustrative of how a pitcher can evolve toward his equilibrium. His fastball, despite good scouting assessments and improved velocity, has been average at best (in terms of run-prevention) since the 2014 season. Meanwhile, his slider has always been nothing short of brilliant on a per-pitch basis. Perhaps it’s no coincidence, then, that since his debut in 2013, Fernandez has gradually thrown fewer fastballs and more sliders.
Moreover, across the entire population of pitchers I looked at, a one-standard-deviation increase in optimality is associated with an earned run average about a quarter of a standard deviation lower. It’s not a huge effect (it only amounts to a handful of runs per season), but it does suggest that a pitcher can reap some benefit from trying to find an equilibrium that evens out every pitch’s effectiveness.
Baseball players have always known the value of mixing up pitches to disrupt a hitter’s rhythm. As Hall of Fame pitcher Warren Spahn once said: “Hitting is timing. Pitching is upsetting timing.” Now, though, sabermetric tools can help quantify the blend of pitch types that does so best, at least until the batters adapt and the cycle starts all over again.