Baseball Scouts Use Numbers, Too

Joe Mauer broke the 2-to-8 scouting scale before I even had a chance to tell you how it works. In the spring of 1999, Mauer was years away from the perennial All-Star catcher he’d become for the Minnesota Twins. He was only 15 years old and in his sophomore season at Cretin-Derham Hall High School in St. Paul, Minnesota, but he was already good enough for a veteran scout like Mike Larson to have heard of him. Larson was so intrigued that one afternoon he drove to the Metrodome in nearby Minneapolis to watch Mauer play in a scrimmage.

That day, Larson watched Mauer’s balanced, upright stance, his startling bat speed, and his rare ability to replicate his smooth left-handed swing over and over.¹ By the time he was a senior, Mauer had become one of the best amateur players Larson had ever seen. When it came time to rate Mauer’s hitting ability, Larson didn’t hesitate: “That’s the only player,” Larson told me, “we have ever put a future 8 hit grade on as a high schooler.”

That 8 grade is the best a player can get for one of his skills, according to the scale that scouts use to grade players. Non-pitchers² are given a present and future grade on each of five tools (hitting ability, power, running speed, arm strength and fielding). A 2 grade is poor, 5 is major league average, 8 is exceptional.

Scouts have been grading prospects like this for decades, navigating a minefield of variables.³ They require a standardized process to measure ability that isn’t based solely on statistical output. Therein lies the beauty of the 2-to-8 scale. “It’s a simple formula,” said Don Pries, a former Orioles executive and former director of the MLB Scouting Bureau. “It’s easy for everyone to read, and for everyone to understand.” To scouts, it’s a lingua franca.

In this era of advanced analytics, the language may seem antiquated. But it provides scouts with an objective rubric that guides them through an inherently subjective process. Around the world, there are countless young prospects, some of whom haven’t played much organized baseball. Even an encyclopedic site like Baseball-Reference.com can’t catalog them all. But using the 2-to-8 scale, scouts can.

“At the end of the day, you’re getting what you really care about from the scout: Who’s better in this particular attribute,” said Sig Mejdal, a former NASA researcher whose title, Houston Astros director of decision sciences, sounds lifted from a short story by Philip K. Dick. But, Mejdal cautioned, “a 60 [rating] to one scout is not a 60 to another scout. That’s the inherent problem to any rating system.” (Many clubs write the same scale as 20 to 80, so a 60 is equivalent to a 6.)

Long before sabermetricians infiltrated Major League Baseball’s front offices, the 2-to-8 scale allowed scouts to quantify the game’s future. In fact, the scouts’ grading system was unwittingly ahead of its time. As several writers have pointed out, if 5 is the mean, then every point above or below that is a standard deviation away from major league average. It’s highly unlikely that old-school baseball scouts ever actually discuss standard deviation while timing a 17-year-old flamethrower’s fastball, but the scale is mathematically sound.
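To make the standard-deviation reading concrete, here is a minimal Python sketch. It is an illustration only, not a formula that scouts or the bureau are documented to use: the function names `grade_from_z` and `z_from_grade`, the rounding to whole grades, and the clamping to the 2-to-8 range are all assumptions made for this example.

```python
def z_from_grade(grade: float) -> float:
    """Distance from major league average, in standard deviations.

    Assumes 5 is the mean and each grade point is one standard deviation.
    """
    return grade - 5


def grade_from_z(z_score: float) -> int:
    """Convert a distance from average (in standard deviations) to a grade.

    Rounding to a whole grade and clamping to the 2-to-8 range are
    illustrative choices, not part of any documented scouting rule.
    """
    raw = 5 + z_score
    return max(2, min(8, round(raw)))


print(z_from_grade(8))    # an 8 sits three standard deviations above average
print(grade_from_z(-3))   # three standard deviations below average is a 2
```

Read this way, the scale's endpoints sit three standard deviations on either side of average, which is one way to see why grades at the extremes are handed out so rarely.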

It’s still unclear who came up with the 2-to-8 scale. “I don’t know why the scale uses those numbers,” longtime Royals scouting director Art Stewart wrote in his new memoir, “The Art of Scouting.” “I’m not sure anyone else does, either.”

There’s a temptation to say it was the creation of Branch Rickey, the visionary baseball executive, whose career as a general manager spanned 1913 to 1955. Rickey fathered the modern farm system, pioneered analyzing players’ “tools,” and at some point used a numerical scouting scale. But his wasn’t 2 to 8. Rickey’s went from 0 to 60, with 30 being average. (His grandson, Pacific Coast League president Branch Barrett Rickey, said in an email that the system was touted “for not allowing the more timid graders to have an easy refuge.”)

And it wasn’t Al Campanis, either. Campanis, a Rickey disciple and the Dodgers’ GM from 1968 to 1987, also came up with his own scale. According to author Kevin Kerrane’s 1984 book “Dollar Sign on the Muscle,” a classic among baseball wonks, Campanis’s system ranged from 60 to 80, with 70 being average.

“I needed something more refined, so I went to numbers,” Campanis told Kerrane. “I thought like a schoolteacher: 70 is a passing grade, so that can represent the major league average on arm or speed or whatever, and 60 and 80 can be the extremes.”

For more people to understand the subjective judgments of a few, baseball needed to standardize its scouting lingo. That meant it needed to centralize it. In 1974, nine years after the first amateur draft, 17 teams each chipped in $120,000⁴ to create the MLB Scouting Bureau.⁵ Its full-time scouts, stationed around North America and the Caribbean, still compile reports made available to all of the league’s teams. The bureau “allows the club to get information for a fraction of the price of having two full-time scouts of their own,” current director Frank Marcos said recently. “They’re getting a lot more bang for the buck.”

At the time, former Milwaukee Brewers GM Jim Wilson, the first head of the bureau, and Pries, then the assistant director, were busy figuring out what direction they wanted the newly formed organization to take. It’s still unclear exactly how Wilson (who died in 1986) and Pries decided on the 2-to-8 scale, but Pries told me they settled on the concept after a brainstorming session. They wanted the bureau to implement a uniform method of evaluating players, and the 2-to-8 idea stuck.

Now, the 2-to-8 scale is part of scout training. In 1989, the bureau began offering a scout development program, a 12-day training course during which evaluators learn, first and foremost, how to spot 5s. “Every scout knows what average looks like,” Grantland writer and former editor-in-chief of Baseball Prospectus Ben Lindbergh wrote last year in a dispatch from scout school, “so when he’s assessing a tool, he pictures its average equivalent and adjusts upward or downward from there.”

The bureau’s exhaustive scouting manual also teaches the 2-to-8 scale in great detail. Beyond explaining basics like what grade to give a pitcher’s 97 mph heater, it also includes sample scouting reports, lists of current major leaguers to which scouts are encouraged to compare prospects, and suggested language that scouts can use to describe players. There’s even a page that lists “Homerisms,” a series of empty expressions (like “Built like a Greek God” and “Got some hot dog in him”) that the Knights of the Keyboard may love but that actually say zilch about a prospect.

But no matter how standardized the scouts’ approach is, the act of rating players is still founded in subjectivity. This is why famously forward-thinking Oakland Athletics GM Billy Beane advocated for “performance scouting,”⁶ an approach that, as Michael Lewis wrote in “Moneyball,” “directly contradicts the baseball man’s view that a young player is what you can see him doing in your mind’s eye. It argues that most of what’s important about a baseball player, maybe even including his character, can be found in his statistics.” It didn’t rely on the whims of scouts, who are less interested in analyzing a prospect’s past performance than they are in predicting his future.

The 2-to-8 scale’s numbers can be deceptive. As Lindbergh noted: “Not all 8s are equally rare: There are more 8 runners than there are 8 hitters. Some grades loosely correspond to big league performance: Someone with 7 power, for instance, can hit 27 to 34 homers at the major league level. But it doesn’t always work that way: a 7 fastball is 94 or 95 mph, but a heater that hard could be a 6 if it has lousy life.”

Overall Future Potential, an extension of the 2-to-8 scale that’s calculated by adding up a player’s future grades and multiplying by two, is also flawed. For example, Larson gave Mauer future grades of 8 (hitting), 6 (power), 3 (speed), 6 (arm) and 6 (fielding). But not all tools should be weighted equally. “Now that we have a more precise understanding of what makes a baseball player valuable,” Jeff Sullivan wrote for this website in April, “we know, for example, that it’s more important to be able to hit for power than it is to throw the ball real fast.”

Another problem is the extra-credit section of OFP. Because Larson believed Mauer’s OFP was too low at 58, he tacked on five points for “outstanding makeup.” That made it 63, a number that projected out to future success. According to the manual, a scout is allowed to add or subtract points based on his “own scouting instinct.” In hindsight, Larson says, he “should’ve raised it 10 to 12 [points] to get him closer to a 70.”
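The OFP arithmetic described above can be checked with a short Python sketch. The calculation (sum the five future grades, multiply by two, then apply the scout's adjustment) and Mauer's grades come from the text; the function name `overall_future_potential`, the `adjustment` parameter and the dictionary layout are hypothetical names chosen for this example, not anything taken from the bureau's manual.

```python
from typing import Mapping

# The five tools graded for non-pitchers, as listed in the article.
TOOLS = ("hitting", "power", "speed", "arm", "fielding")


def overall_future_potential(future_grades: Mapping[str, int],
                             adjustment: int = 0) -> int:
    """Sum the five future grades, double the total, then add the adjustment.

    `adjustment` stands in for the points a scout may add or subtract
    based on instinct (for example, for makeup).
    """
    missing = [tool for tool in TOOLS if tool not in future_grades]
    if missing:
        raise ValueError(f"missing future grades for: {', '.join(missing)}")
    base = 2 * sum(future_grades[tool] for tool in TOOLS)
    return base + adjustment


# Larson's future grades for Mauer, as reported in the article.
mauer = {"hitting": 8, "power": 6, "speed": 3, "arm": 6, "fielding": 6}

print(overall_future_potential(mauer))                # 58
print(overall_future_potential(mauer, adjustment=5))  # 63
```

Because every tool counts the same in the sum, Mauer's 3 for speed drags the total down just as much as his 8 for hitting pulls it up, which is the equal-weighting flaw noted above.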

Mejdal compared the 2-to-8 scale, warts and all, to the QWERTY keyboard. Efficient or not, it’s here to stay. And that’s not necessarily a bad thing. “The tools are a wonderful heuristic to walk [scouts] through to get to the overall value of a player,” Mejdal said. “It’s certainly not perfect, but it’s a very good heuristic.”

It’s also not the only heuristic; most successful teams use a variety of assessment methods. In an oft-cited Baseball Prospectus column written in the wake of “Moneyball,” Dayn Perry expressed his frustration over people asking whether an organization should rely on scouts or statistics. “My answer,” he wrote, “is the same it would be if someone asked me: ‘Beer or tacos?’ Both, you fool.”

And as long as there are amateur players to evaluate, scouts will make sight-based judgments. That, of course, means they’ll be using the 2-to-8 scale. Last year, a decades-old scouting report surfaced that made Mauer’s look almost pedestrian. The player had impressed scout Ken Gonzalez so much that he handed out future 7s and 8s in power, speed, arm strength and fielding. On his report, dated April 18, 1985, Gonzalez wrote: “The best pure athlete in America today.”

His name? Vincent Edward Jackson. To most of us, reducing someone as mythically powerful as Bo Jackson to a series of single-digit numbers is decidedly unromantic. To scouts, it’s necessary.

Special thanks to the Society for American Baseball Research’s Rod Nelson, who helped guide my research.

Footnotes

¹ In his first at-bat of the game, against the best pitcher in the state, Mauer stroked the first fastball he saw up the middle. His next time up, facing the same pitcher, he launched a home run over the right-center field fence and into the football press box.
² Pitchers are judged on fastball, curveball, slider and a fourth pitch.
³ For example: the way a player’s coach uses him, the size of the ballpark he plays in, and the level of competition he regularly faces.
⁴ Kerrane reported this in “Dollar Sign on the Muscle.”
⁵ Some teams resisted at first, initially choosing not to participate in the service. “The bureau teams kind of represented collectivism,” Kerrane said. Major League Baseball didn’t officially oversee the bureau until 1985, when then-commissioner Peter Ueberroth brought it under the authority of his office.
⁶ Beane also liked college players, Michael Lewis wrote in “Moneyball,” because “you could project college players with greater certainty than you could project high school players. The statistics enabled you to find your way past all sorts of sight-based scouting prejudices: the scouting dislike of short right-handed pitchers, for instance, or the scouting distrust of skinny little guys who get on base. Or the scouting distaste for fat catchers.”

FiveThirtyEight
