FiveThirtyEight

Sports

Dick Pfander has spent most of his life collecting and analyzing box scores from every NBA game since the league’s founding. He did most of his work in solitude, by hand, before the age of personal computers. And he did it simply for his own pleasure, surrounded by supportive family members who cared neither about basketball nor statistics, let alone their intersection.

Today, his analog hobby is paying digital dividends for stats-obsessed basketball fans. His work has helped fill gaps in the league’s statistical record for both its official website and the leading independent reference site. The project continues — but without Pfander.

An all-consuming “job”

Pfander started clipping box scores from newspapers as a teenager in Battle Creek, Michigan, in the late 1940s. He did it during high school, after his marriage to Colette Waterman, through jobs as a teacher and with the Defense Department, through the birth of his three children, and through Colette’s death, in 2009.

On vacations, Dick and Colette would travel to places where “he thought there might be a newspaper of use to him” in the local library’s archive, his daughter Julaine Eddy said in a telephone interview. “She’d drop him off and go do things around town while he sat in front of the microfiche machine.”

Pfander’s children say they and their mother didn’t share his passion for NBA stats, but they didn’t resent it. It was just one way he expressed his love for basketball and for statistics. He also refereed basketball games and compiled the stats for local youth baseball tournaments and swim meets. He didn’t mind that none of his children played basketball or got into stats.

Most of all, he sat in front of the television, “going back and forth between watching the basketball and working on the stats,” said his son, Greg. “It never bothered me that he did it — it was his thing. It just seems like that’s my dad, that’s what he always did.”

Julaine’s memory of her father working on his stats is vivid. “Dad had this huge desk at home, and it was him sitting at that desk. The TV was visible from that desk, and he sat there and worked,” she said. “And as a kid, you think, ‘That must be a job he’s doing.’”

Why did he do it? Pfander, 79, isn’t expansive on the topic. “It was a hobby for me,” he said in an interview last week. “It was a fun thing for me to do.” He considered himself a statistician long before NBA teams started hiring statisticians. “I had always been interested in statistics,” Pfander said, and “I kind of liked doing statistical-type things.”

He added, “I don’t think anybody would do all that unless they enjoyed it.”

Dick Pfander (photo courtesy of Colleen Greff)

His hobby nonetheless might have stalled decades ago if he hadn’t connected with his best single box-score source, a kindred spirit who collected NBA stats for a living: Harvey Pollack. The Philadelphia 76ers’ director of statistical information has been working in pro basketball since before the NBA, debuting with the league’s forerunner, the Basketball Association of America, in 1946. The two men aren’t clear on when they began collaborating. Pfander said in a telephone interview he thinks it was 1956; Pollack, who is 92, recalled that they started working together when he expanded his annual NBA stats book in the 1970s. They met at most a few times, but they communicated at least a dozen times each year by phone and mail.

Pollack, who has worked for the Sixers since the team’s first season in 1963-64, was also collecting box scores for every game, though his collection didn’t go back as far as Pfander’s. Pollack would send Pfander his book of box scores, and Pfander would send back results of his data analysis — tidbits such as when the league’s millionth point was scored. That archival work helped when the NBA marked later milestone league totals. “He’s the one who started with it,” Pollack said. “That was his idea. I’ve kept it alive ever since.”

Pollack put many of these small discoveries in his annual books, always crediting Pfander, whose stats were meticulously calculated and, before he started using a computer, meticulously handwritten, with a fountain pen. “He has the best handwriting you have ever seen,” said his daughter Colleen Greff.

Pfander said he used to check the box scores Pollack sent, to make sure the numbers added up. Then he would generate 33 or 34 different stats for Pollack in return. Pollack said his latest book includes at least 40 stats furnished by Pfander.

Filling holes in the record

Part of the fun of sports is measuring today’s players and teams against their predecessors, and you can’t do that without a complete record of what past players did. Every sport’s fan base includes completists, people who feel unsettled by the lack of certainty in the records.

Ten years ago, Justin Kubatko founded the website Basketball-Reference.com as a resource for fans who want to know every detail about basketball history. Seven years ago, he left his job teaching statistics at Ohio State University to work full time on basketball stats.

Before he connected with Pfander, Kubatko had game-by-game data on his site going back only to the mid-1980s.

“I’m a completist,” Kubatko said in a telephone interview. “It did kind of bug me. We had 40 years of information that was just not there. I was also a realist. I knew there was really no easy way to acquire that data.”

Kubatko doesn’t recall when he first heard of Pfander’s box-score collection — perhaps from the discussion boards at the basketball analytics site APBR.org, where Pfander has been a frequent topic of discussion. “I got his contact information, we talked for a few weeks, we worked out a deal, and we bought what he had,” Kubatko said. “He is an extremely generous and extremely nice man.”

By this time, Pfander had digitized his box scores, scanning and sorting them — meticulously, of course. The disk Kubatko received, in 2008, had folders for each year and subfolders for months and for days. Even so, the box scores were saved as images, not spreadsheets or databases, so it wasn’t easy to add them to a stats database.

Kubatko sat on the trove for at least a year. “Then we said, ‘You know what, this is kind of silly,’” he recalled. “People are probably interested in what we have as it is.” He wrote a script to link the team abbreviations Pfander used with those on the site and to pair each scan with that game’s unique Basketball Reference ID. Then he fixed the mismatches that arose from errors in the box scores or in the site’s schedule data.
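The matching step described above can be sketched roughly as follows. This is a minimal illustration, not Kubatko's actual script: the `YYYY/MM/DD/AWAY-HOME.png` file layout, the abbreviation table, the shape of the schedule lookup and the game ID format are all assumptions made for the example.

```python
from datetime import date
from pathlib import PurePosixPath

# Hypothetical table translating the collector's team abbreviations
# into the ones the site uses. The real mapping is not given in the article.
ABBREV_MAP = {"PHIL": "PHI", "BOST": "BOS", "NY": "NYK"}


def scan_key(scan_path, abbrev_map=ABBREV_MAP):
    """Turn an assumed 'YYYY/MM/DD/AWAY-HOME.png' path into (date, away, home)."""
    path = PurePosixPath(scan_path)
    # The three folders above the file are taken to be year, month and day.
    year, month, day = (int(part) for part in path.parts[-4:-1])
    away, home = path.stem.split("-")
    # Abbreviations missing from the table are passed through unchanged.
    return (date(year, month, day),
            abbrev_map.get(away, away),
            abbrev_map.get(home, home))


def pair_scans(scan_paths, schedule):
    """Pair each scan with a game ID.

    `schedule` is an assumed dict mapping (date, away, home) to a game ID.
    Returns the matched scans keyed by game ID, plus the paths that found
    no game and need to be fixed by hand.
    """
    matched, unmatched = {}, []
    for scan_path in scan_paths:
        game_id = schedule.get(scan_key(scan_path))
        if game_id is None:
            unmatched.append(scan_path)
        else:
            matched[game_id] = scan_path
    return matched, unmatched
```

The `unmatched` list stands in for the mismatches the article mentions: a scan whose date or teams do not line up with the schedule has to be resolved manually, whether the error is in the box score or in the schedule data.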

In January 2012, Kubatko announced that Basketball Reference now had every box score for every NBA game. His blog post credited Pfander, “who did the lion’s share of the work for this project.” At Grantland a few days later, Robert Mays wrote about the pair and Pfander’s unlikely collection.

Fans wouldn’t be able to work with the old stats, though, until Kubatko could get the games into the database. He found Sean Wrona, a champion competitive typist, who “keyed in all this stuff for us off the scans, and we paid him to do that,” Kubatko said. “He did the first several seasons we put out there. All I’ll say is I found a more efficient way to handle the other seasons.”

Wrona said in an email[1] that each box score took him about 10 minutes, or just five when it was abbreviated and missing some statistical categories. “Accuracy is far more important than speed for archival work (unlike for competitive typing, where raw speed is far more important), so I don’t come close to my peak typing speed while archiving, but it still helps,” Wrona said. He continues to do data-entry work for Sports Reference, Basketball Reference’s parent company.

Wrona’s typing was the fastest part of the project. To merit inclusion in the database, a box score had to make sense: Players’ numbers had to add up to team totals, for instance. Newspaper box scores are the first draft of basketball statistical history. Kubatko estimated that each season had about 100 errors. He could resolve most using online news databases. Kubatko also sought help from his readers, posting on the blog his “most wanted” list for box scores where the sum of players’ scores didn’t equal the team total.
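The consistency check described here, that players' numbers must add up to the team totals, might look something like the sketch below. The field names (`fg`, `ft`, `pts`) and the row format are assumptions for illustration; the article does not describe how Basketball Reference actually stores or validates its data.

```python
def find_mismatches(player_rows, team_totals, categories=("fg", "ft", "pts")):
    """Compare summed player stats with the listed team totals.

    `player_rows` is a list of dicts, one per player, and `team_totals`
    is a dict of the totals printed in the box score. Returns a dict of
    {category: (sum_of_players, listed_total)} for every category where
    the two numbers disagree.
    """
    mismatches = {}
    for category in categories:
        if category not in team_totals:
            # Abbreviated box scores may omit a category entirely.
            continue
        player_sum = sum(row.get(category, 0) for row in player_rows)
        if player_sum != team_totals[category]:
            mismatches[category] = (player_sum, team_totals[category])
    return mismatches
```

A box score with an empty result is internally consistent. A non-empty result only says that something is wrong, not where, which is why the errors then had to be run down in other sources.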

The NBA steps in

Kubatko’s effort to build a publicly accessible archive of the game’s history made slow but sure progress.[2] Last March, he announced a big breakthrough: The database now went back to 1964-65.

Since then, the work has stalled: Basketball Reference has added just one more year. One reason is that Kubatko left Sports Reference in August, citing “creative differences.”[3]

Sports Reference’s founder and president, Sean Forman, says the work to fill in the remaining 18 years of box-score data continues, but it isn’t a priority. That’s because the NBA itself announced in February 2013 that it had posted the box score for every game, all the way back to 1946-47, to NBA.com. “We want to be first on things,” Forman said. “Now that the NBA already has that data up, it’s a little bit less of an impetus.”

So where did the NBA get that data? It always had “thorough statistical records,” league spokesman John Acunto said, but hadn’t figured out how to best publish most of them online until partnering with SAP, the tech company that powers the NBA’s revamped stats site. Before the site relaunch last year, the NBA offered game-by-game stats online going back only to 2007, Acunto said.

Pfander’s box-score archive was one source the league used to fill gaps. “We have acquired from Pfander and others additional references and sources to cross-reference and validate our information,” Acunto said.

Pete Palmer, a veteran sports statistician, said he and Pfander collaborated on using the box-score collection to correct errors in the league’s records. Palmer said that Pfander also sold his box scores to the statisticians at the Elias Sports Bureau.[4]

Steve Hirdt, a statistician at often-secretive Elias, declined to comment on whether the company worked with Pfander. “It’s just not something we discuss,” he said by phone.

The legend retires

Pfander is far from the only amateur completist to aid sports historians. Pollack, of the Sixers, credited regular contributions from other basketball enthusiasts in his book over the years. David W. Smith has led a team of volunteers in the ongoing, ambitious effort to fill the record of every Major League Baseball play. Wrona built an online auto-racing database for a dozen series. Forman cited the contribution of Ed Washuta, who entered minor league baseball stats over a century old. “Pfander is an exemplar in that he has produced such a tremendous set of data for the public,” Forman said.

The work of filling and correcting the NBA statistical record goes on. Many of the older box scores contained only field goals made, free throws made and points scored for each player. And some box scores are missing players, such as these on NBA.com. Fans often write to the league to suggest corrections, which it makes when appropriate, Acunto said. Sports Reference similarly invites corrections from readers.

Pfander, a user of the site, continued to help Sports Reference’s efforts after shipping his data. “Some of those older scans are really poor,” Kubatko said. “It was very hard to read some of them. He was trying to find replacements for those, and occasionally he would send us stuff that would give us a better scan.” Pfander also did some digitizing work himself, typing old box scores into Microsoft Word documents. Word tables were his instrument of choice for organizing his digital data. “He was committed, definitely,” Kubatko said.[5]

The work could go on forever. “You’re never going to get a perfect set of box scores,” Kubatko said. “It’s just not going to happen.”

Today, though, Pfander is no longer actively working on NBA statistics. In November 2012 — just as Basketball Reference’s effort to input his data was gaining momentum — Pfander had surgery for a brain aneurysm. Afterward, he was confused, “and he actually made the joke, something about, you should never have brain surgery during NBA season, because it just messes up all the statistics,” his daughter Colleen recalled.

Later that month, on his 78th birthday, Pfander suffered a stroke. The stroke has affected his short-term memory, Pfander and his children say. They tell him that the speech he gave at his wife’s memorial service was the best they’ve ever heard, and he laments that there’s no recording for him to listen to and remind him of what he said.

The stroke also has interrupted Pfander’s statistical work. He continues to watch the NBA from his current home at a care facility but, he said, “I’ve had some physical setbacks, and I just don’t do it anymore.” He added, “I started a couple of different times, and I guess I’d say I don’t have the interest in making all those copies of box scores and then filling in the blanks so that I can add ’em up and see strings of double-figure scoring and things like that.”

His daughter Julaine has cleaned up his computer’s desktop so that it has just two folders, one of them labeled “basketball stats.” She’s also backed up his data onto an external hard drive. It’s all ready for him to dive back in. “He doesn’t seem to want to do that,” she said. “Part of me thinks he worries that if he doesn’t do it exactly right, he might mess something up. He knows he has all this great data out there, and the last thing he would want to do is not do it the way he used to do it.”

“It’s so sad now,” Colleen said. “It was such a passion for him, oh my goodness, he could not bear to fall behind. Now the desire to do statistics is just not there.”

If he never revisits basketball stats, Pfander’s legacy is secure, and that’s some comfort. “I think he was probably happy that all of this work he had done — because he had really done it for really personal reasons — he was happy that it would be able to reach a larger audience, and that other people would be able to benefit from the work he had done,” Kubatko said.

Of his hobby’s role in completing the Basketball Reference database, Pfander said, “It makes me feel that it was useful.”

Footnotes

  1. Presumably typed quickly: He’s been clocked above 250 words per minute.
  2. In October 2012, Kubatko announced partial box scores were available for the 1982-83 and 1983-84 seasons. In December, Basketball Reference’s database stretched to 1981-82. The next month, he’d added one more year. By March of last year, the database went back to 1976-77, the first year after the NBA merged with the American Basketball Association.
  3. He retains a stake in the company, and is, as of earlier this month, an NBA consultant, through his company, Statitudes LLC.
  4. “As I remember it, some of the box scores were hard to read and Dick had to prepare typed files for Elias,” Palmer said by email.
  5. Pfander also sent along his ABA box scores, and inputting those is another project waiting to be completed.
