Game Theory Says R.A. Dickey Should Throw More Knuckleballs

Of all the strategic elements of baseball, few are more fascinating than the poker game between pitcher and hitter. Each participant knows his strengths and those of his adversary, and that knowledge informs both players’ tactics in a complex entanglement of actions and counteractions.

If the best pitch in a hurler’s repertoire is his fastball, for instance, he might be inclined to throw it very often. But batters will pick up on that proclivity, and in time the fastball will lose its effectiveness if it’s not balanced against, say, a change-up, even if the fastball is a far better pitch on paper.

Eventually, we would expect this pitcher’s arsenal to settle into the optimal mix for retiring opposing hitters: a mix of fastballs and change-ups that’s impossible for a batter to exploit.¹ In game theory terms (and assuming the batter adapts accordingly), this is a version of the famous Nash equilibrium, which describes a situation in which neither party in a game has anything to gain by changing his or her strategy.

That’s all, well, theory. But how can we detect which real-life pitchers are closest to their equilibria? One idea is to look for hurlers who are about equally effective with every kind of pitch they throw. And fortunately, FanGraphs tracks not only the frequency with which each pitch type is employed, but also its potency, estimated in terms of runs added or subtracted per 100 pitches. Using that data to find out how balanced a pitcher’s performance is across his entire repertoire, I computed a metric that I’m dubbing the “Nash Score.”

Here’s how it works: Start by measuring for each pitch type the difference² between its effectiveness and that of all the pitcher’s other pitches combined. Then weight those differences according to the frequency with which each pitch is thrown. The resulting average is the Nash Score, a sort of variance that measures whether a pitcher is close to his equilibrium (lower score) or could conceivably benefit from varying the distribution of his pitches (higher score).
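One way to read that recipe as code is sketched below. It is a minimal sketch, not necessarily the exact calculation behind the numbers in this article. It assumes that "all the pitcher’s other pitches combined" means a frequency-weighted average of their run values, and that the final weights are normalized to sum to one. The function names and the sample numbers are hypothetical.

```python
def relative_values(freqs, values):
    """Each pitch's run value minus the frequency-weighted run value
    of all the pitcher's other pitches combined (an assumption)."""
    total_freq = sum(freqs)
    weighted_total = sum(f * v for f, v in zip(freqs, values))
    rel = []
    for f, v in zip(freqs, values):
        rest_freq = total_freq - f
        if rest_freq <= 0:
            # A one-pitch pitcher has nothing to compare against.
            rel.append(0.0)
        else:
            rest_value = (weighted_total - f * v) / rest_freq
            rel.append(v - rest_value)
    return rel


def nash_score(freqs, values):
    """Frequency-weighted average of the squared relative values.
    Lower means the pitches are closer to equally effective."""
    total_freq = sum(freqs)
    rel = relative_values(freqs, values)
    return sum(f * r * r for f, r in zip(freqs, rel)) / total_freq


# Made-up two-pitch example: equal usage, run values of +1 and -1 per 100.
print(nash_score([0.5, 0.5], [1.0, -1.0]))  # 4.0
```

A pitcher whose pitches all carry the same run value scores zero under this sketch, whatever his usage rates. The score grows as heavily used pitches drift away from the rest of the repertoire.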

Take R.A. Dickey as an example. The Blue Jays starter, known for his mesmerizing knuckleball, throws the pitch 87 percent of the time, about as much as any pitcher in baseball uses his No. 1 pitch. Yet Dickey’s Nash Score isn’t especially low, so under the concept of equilibria outlined above, he should be using the knuckler even more. Dickey’s fastball, his No. 2 (and essentially only other³) pitch, is far less effective than his knuckler, even in its limited use as a complementary, change-of-pace pitch. According to game theory, Dickey could conceivably boost his overall effectiveness by throwing the knuckleball on an even greater proportion of his pitches.

Going beyond Dickey, here are the (qualifying⁴) pitchers with the most and least optimal mixtures of pitches over the past three seasons,⁵ at least according to this method:

In the chart, a pitch type’s “relative value” is its run value (per 100 pitches) compared to that of all the pitcher’s other pitches combined. So this method surmises that Tanner Roark of the Washington Nationals is closer to operating at equilibrium than any other pitcher⁶ because his two most frequent pitches (fastball and slider) are each barely more effective than the rest of his repertoire and his third choice (change-up) is barely less effective. Based on our game theory of pitch selection, it stands to reason that Roark wouldn’t get much of a performance boost by altering his frequencies.

But is this actually true? Roark was very good in 2013 (mostly out of the bullpen) with what this system considers a less-optimal mix of pitches than he has now. He was merely solid last year (as a starter) even though his pitch frequencies were far closer to the theoretical equilibrium. And he has been pretty dismal in 2015 (mostly back in the ’pen) despite maintaining a supposedly optimal pitch mix. Therein lies one problem with this methodology: Pitcher performance is notoriously unstable in small samples — and small samples are all we have when trying to zero in on the equilibrium frequencies for a pitcher’s current repertoire.

The year-to-year correlation of a pitcher’s Nash Score, for instance, is essentially zero. Optimally mixing pitches one season is no guarantee of doing it the next, in part because per-pitch performance measures are themselves fickle, but also because the batter-pitcher chess match is in a state of constant flux. Furthermore, there’s little relationship between how good a pitcher is⁷ and how well he optimizes his repertoire. The Miami Marlins’ Jose Fernandez is one of the best pitchers on Earth, yet this method says he’s still drastically overusing his fastball and underusing his breaking pitches.

Then again, Fernandez might also be illustrative of how a pitcher can evolve toward his equilibrium. His fastball, despite good scouting assessments and improved velocity, has been average at best (in terms of run-prevention) since the 2014 season.⁸ Meanwhile, his slider has always been nothing short of brilliant on a per-pitch basis. Perhaps it’s no coincidence, then, that since his debut in 2013, Fernandez has gradually thrown fewer fastballs and more sliders.

Moreover, for the entire population of pitchers I looked at, a one-standard-deviation increase in optimality⁹ is associated with an earned run average about a quarter of a standard deviation¹⁰ lower. It’s not a huge effect (it only amounts to a handful of runs per season), but it does suggest that a pitcher can reap some benefit from trying to find an equilibrium that evens out every pitch’s effectiveness.
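For readers who want to run the same kind of check on their own data, here is a hedged sketch of what a "standard deviations per standard deviation" effect means. It assumes a simple two-variable regression of ERA on Nash Score, which may not match the model used for this article. The function names are hypothetical, and no real pitcher data is included.

```python
from math import sqrt


def zscores(xs):
    """Rescale a list to mean 0 and (sample) standard deviation 1."""
    n = len(xs)
    mean = sum(xs) / n
    sd = sqrt(sum((x - mean) ** 2 for x in xs) / (n - 1))
    return [(x - mean) / sd for x in xs]


def standardized_slope(xs, ys):
    """Least-squares slope of standardized y on standardized x.
    With one predictor this equals the Pearson correlation."""
    zx, zy = zscores(xs), zscores(ys)
    return sum(a * b for a, b in zip(zx, zy)) / sum(a * a for a in zx)


# Usage, given one Nash Score and one ERA per pitcher:
# slope = standardized_slope(nash_scores, eras)
```

Under that assumption, a slope of about 0.25 of ERA on Nash Score would correspond to the effect described above: a pitcher one standard deviation lower in Nash Score would be expected to have an ERA about a quarter of a standard deviation lower.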

Baseball players have always known the value of mixing up pitches to disrupt a hitter’s rhythm. As Hall of Fame pitcher Warren Spahn once said: “Hitting is timing — pitching is upsetting timing.” Now, though, sabermetric tools can help quantify the blend of pitch types to best do so — that is, until the batters adapt and the cycle starts all over again.

Footnotes

¹ Hitters nudge a pitcher toward a better mix every time they correctly guess which pitch is coming.
² Squared difference, to be exact.
³ Dickey uses a third pitch, the change-up, only 2 percent of the time.
⁴ Minimum 5,000 weighted pitches, or about 830 pitches per season.
⁵ Weighted for recency according to Marcel’s values: a weight of three for 2015, two for 2014 and one for 2013.
⁶ At least among pitchers in our 2013-15 sample.
⁷ On a runs-saved-per-pitch basis.
⁸ He was injured during the 2014 season, only returning from Tommy John surgery last month.
⁹ That is, a one-standard-deviation reduction in Nash Score.
¹⁰ Or about a fifth of a run per nine innings.

FiveThirtyEight
