What the Fox Knows

FiveThirtyEight is a data journalism organization. Let me explain what we mean by that, and why we think the intersection of data and journalism is so important.

If you’re a casual reader of FiveThirtyEight, you may associate us with election forecasting, and in particular with the 2012 presidential election, when our election model “called” 50 out of 50 states right.

Certainly we had a good night. But this was and remains a tremendously overrated accomplishment. Other forecasters, using broadly similar methods, performed just as well or nearly as well, correctly predicting the outcome in 48 or 49 or 50 states. It wasn’t all that hard to figure out that President Obama, ahead in the overwhelming majority of nonpartisan polls in states such as Ohio, Pennsylvania, Nevada, Iowa and Wisconsin, was the favorite to win them, and was therefore the favorite to win the Electoral College.

Instead, our forecasts stood out in comparison to others in the mainstream media. Commentators as prestigious as George F. Will and Michael Barone predicted not just a Mitt Romney win, but a Romney sweep in most or all of the swing states. Meanwhile, some news reporters defaulted to characterizing the races as “toss-ups” when the evidence suggested otherwise.¹

The other reason I say our election forecasts were overrated is that they didn’t represent the totality, or even the most important part, of our journalism at FiveThirtyEight. We also covered topics ranging from the increasing acceptance of gay marriage to the election of the new pope, along with subjects in sports, science, lifestyle and economics. Relatively little of this coverage entailed making predictions. Instead, it usually involved more preliminary steps in the data journalism process: collecting data, organizing data, exploring data for meaningful relationships, and so forth. Data journalists have the potential to add value in each of these ways, just as other types of journalists can add value by gathering evidence and writing stories.

The breadth of our coverage will be much clearer at this new version of FiveThirtyEight, which is launching Monday under the auspices of ESPN. We’ve expanded our staff from two full-time journalists to 20 and counting. Few of them will focus on politics exclusively; instead, our coverage will span five major subject areas — politics, economics, science, life and sports.

Our team also has a broad set of skills and experience in methods that fall under the rubric of data journalism. These include statistical analysis, but also data visualization, computer programming and data-literate reporting. So in addition to written stories, we’ll have interactive graphics and features. Within a couple of months we’ll launch a podcast, and we’ll be collaborating with ESPN Films and Grantland to produce original documentary films. You’ll find us on television and radio, and on Twitter, Instagram and Facebook. We’ll share data and code on GitHub.

Our logo depicts a fox (we call him Fox No. 9²) as an allusion to a phrase originally attributed to the Greek poet Archilochus: “The fox knows many things, but the hedgehog knows one big thing.” We take a pluralistic approach and we hope to contribute to your understanding of the news in a variety of ways.

By no means do we think that everything can be broken down into a formula or equation. On the contrary, one of our roles will be to critique incautious uses of statistics when they arise elsewhere in news coverage. At other times, we’ll explore ways that consumers can use data to their advantage and level the playing field against corporations and governments.

Our methods are not meant to replace “traditional” or conventional journalism.³ We have the utmost admiration for journalists who gather original information and report original stories. Our staff includes alumni from traditional news organizations like The New York Times, The Wall Street Journal, The Guardian and The Washington Post (along with others from digital news organizations, blogs and from outside journalism entirely).

Still, I would never have launched FiveThirtyEight in 2008, and I would not have chosen to broaden its coverage so extensively now, unless I thought there were some need for it in the marketplace. Conventional news organizations on the whole are lacking in data journalism skills, in my view. Some of this is a matter of self-selection. Students who enter college with the intent to major in journalism or communications have above-average test scores in reading and writing, but below-average scores in mathematics. Furthermore, young people with strong math skills will normally have more alternatives to journalism when they embark upon their careers and may enter other fields.⁴

This is problematic. The news media, as much as it’s been maligned, still plays a central role in disseminating knowledge. More than 80 percent of American adults spend at least some time with the news each day. (By comparison, about 25 percent of Americans of all ages are enrolled in educational programs.)

Meanwhile, almost everything from our sporting events to our love lives now leaves behind a data trail. Much of this data is available freely or cheaply. There is no lack of interest in exploring and exploiting it: Google searches for terms like “big data” and “data analytics” have grown at exponential rates, almost as quickly as the quantity of data itself has grown.

And yet, as I describe in my book, “big data” has not yet translated into widespread gains in economic conditions, human welfare or technological growth. Some individual companies and industries, and some branches of science, have employed data in constructive ways. But “Moneyball” stories are more the exception than the rule.

Journalism is far from the whole problem. Science, government, academia and the private sector also have struggled to find the signal in the noise. But journalism is our chosen profession. There is both a need for more data journalism and an opportunity to build a business out of it.

That opportunity has required us to think deeply about the strengths and weaknesses of conventional approaches to journalism. One of our first attempts came in the form of a two-dimensional chart, which I shared several weeks ago with Jack Dickey of Time magazine. The chart posits a distinction between quantitative versus qualitative approaches on the one hand and rigorous versus anecdotal approaches on the other.

The point is that data journalism isn’t just about using numbers as opposed to words. To be clear, our approach at FiveThirtyEight will be quantitative — there will be plenty of numbers at this site. But using numbers is neither necessary nor sufficient to produce good works of journalism.

Indeed, as more human behaviors are being measured, the line between the quantitative and the qualitative has blurred. I admire Brian Burke, who led the U.S. men’s hockey team on an Olympic run in 2010 and who has been an outspoken advocate for gay-rights causes in sports. But Burke said something on the hockey analytics panel at the MIT Sloan Sports Analytics Conference last month that I took issue with. He expressed concern that statistics couldn’t measure a hockey player’s perseverance. For instance, he asked, would one of his forwards retain control of the puck when Zdeno Chara, the Boston Bruins’ intimidating 6’9″ defenseman, was bearing down on him?

The thing is, this is something you could measure. You could watch video of all Bruins games and record how often different forwards kept control of the puck. Soon, the NHL may install motion-tracking cameras in its arenas, as other sports leagues have done, creating a record of each player’s x- and y-coordinates throughout the game and making this data collection process much easier.
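As a rough illustration of the tally described above, here is a minimal Python sketch. It assumes someone has already watched the video and logged, for each time a forward carried the puck with the defenseman closing in, whether he kept control. The player labels and the observations are hypothetical placeholders, not real game data.

```python
from collections import defaultdict

# Hypothetical hand-coded observations, one per encounter:
# (forward, kept_control). Labels and values are placeholders only.
observations = [
    ("Forward A", True),
    ("Forward A", False),
    ("Forward A", True),
    ("Forward B", False),
    ("Forward B", False),
    ("Forward B", True),
    ("Forward B", True),
]


def retention_rates(obs):
    """Return {forward: (kept, attempts, kept / attempts)}."""
    kept = defaultdict(int)
    attempts = defaultdict(int)
    for forward, kept_control in obs:
        attempts[forward] += 1
        if kept_control:
            kept[forward] += 1
    return {f: (kept[f], attempts[f], kept[f] / attempts[f]) for f in attempts}


rates = retention_rates(observations)
for forward, (k, n, rate) in sorted(rates.items()):
    print(f"{forward}: kept control {k} of {n} times ({rate:.0%})")
```

With only a handful of encounters per player, rates like these would be noisy; the sketch shows the bookkeeping, not a conclusion.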

I would ask a lot of questions of this data if I had it. For instance:

Is it smart for a player to keep control of the puck when Chara (or a similarly gifted defenseman) has him in his sights? Might the player yield fewer turnovers if he passed the puck instead?
Would measuring a player’s perseverance give us meaningful information beyond what is reflected in “box score” statistics, such as goals, assists and plus-minus?
Do players who persevere under threat match those who are regarded as “tough” or as having a lot of “heart” by coaches, scouts and commentators? If not, is the metric flawed, or are the coaches biased?

The quality of hockey statistics is fairly poor compared to those for baseball or basketball, so I can understand Burke’s skepticism. But often, general managers and CEOs and op-ed columnists use the lack of data as an excuse to avoid having to examine their premises.

At other times, commentators cite statistics even as they decry their uselessness. Peggy Noonan, the Wall Street Journal columnist, wrote a blog post on the eve of the 2012 election that critiqued those of us who were “too busy looking at data on paper instead of what’s in front of us.” Instead, “all the vibrations” were right for a Romney victory, she wrote.

Among other things, Noonan cited the number of Romney yard signs, and the number of people at his rallies, as evidence that he was bound to win. But these “vibrations” are, in fact, quantifiable. You could hire a team of stringers to drive around randomly selected neighborhoods in swing states and count the yard signs. And news accounts routinely estimate the number of attendees at political rallies. Noonan could have formulated a testable hypothesis: Do yard signs predict election outcomes better than polls do?
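One way to frame that hypothesis as a test, sketched in Python under loud assumptions: suppose you had, for a set of past contests, a record of which candidate led in yard signs, which led in the polls, and who actually won. The records below are invented placeholders that exist only to make the code run; they carry no evidential weight about yard signs or polls.

```python
# Hypothetical records, one per past contest. "X" and "Y" are placeholder
# candidates; none of these rows describes a real election.
contests = [
    {"signs": "X", "polls": "Y", "winner": "Y"},
    {"signs": "X", "polls": "X", "winner": "X"},
    {"signs": "Y", "polls": "X", "winner": "X"},
    {"signs": "Y", "polls": "Y", "winner": "X"},
    {"signs": "X", "polls": "Y", "winner": "X"},
]


def hit_rate(records, indicator):
    """Share of contests in which the indicator's leader went on to win."""
    hits = sum(1 for r in records if r[indicator] == r["winner"])
    return hits / len(records)


sign_accuracy = hit_rate(contests, "signs")
poll_accuracy = hit_rate(contests, "polls")
print(f"yard signs: {sign_accuracy:.0%}, polls: {poll_accuracy:.0%}")
```

A real test would need far more contests than this, chosen before looking at the outcomes, so that the comparison could actually fail.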

The problem is not the failure to cite quantitative evidence. It’s doing so in a way that can be anecdotal and ad-hoc, rather than rigorous and empirical, and failing to ask the right questions of the data.

In contrast, plenty of journalistic methods are rigorous without being quantitative. Investigative reporting, which synthesizes multiple threads of evidence to find the truth behind a story, would fall into this category. Explanatory journalism, including the new ventures launched by Ezra Klein at Vox Media and David Leonhardt at The New York Times, will often do so. Masterful works of history and biography, from Robert Caro to Richard Ben Cramer, also belong here.

However, I acknowledge some problems with the chart and its two-dimensional conception of journalism. For one, in its aversion to anecdotal evidence, this analysis is too dismissive of the important role that journalists play in uncovering new information.

You may have heard the phrase “the plural of anecdote is not data.” It turns out that this is a misquote. The original aphorism, by the political scientist Ray Wolfinger, was just the opposite: “The plural of anecdote is data.”

Wolfinger’s formulation makes sense: Data does not have a virgin birth. It comes to us from somewhere. Someone set up a procedure to collect and record it. Sometimes this person is a scientist, but she also could be a journalist.

Take, for example, endorsements made by elected officials (say, senators and governors) during the presidential nomination process. Headlines such as “Arizona Gov. Jan Brewer endorses Mitt Romney” might seem like just the sort of slow-news-day story that journalists make too much of. Indeed, any single endorsement is unlikely to make much difference. It turns out, however, that the sum total of these endorsements has quite a bit of predictive power. A team of political scientists, in their book “The Party Decides,” found that intra-party endorsements often out-predict the polls in the nomination process, especially in the early going.⁵

So perhaps we should think more carefully about the process by which anecdote is transformed into data and information. We might break it down into four rough steps:

The first step is the collection of data or evidence. For a traditional journalist, this is likely to involve some combination of interviewing, documentary research and first-person observation. But data journalists also have ways of collecting information, such as by commissioning polls, performing experiments or scraping data from websites.

The next step is organization. Traditional journalists have a well-established means of organizing information: They formulate a news story. The story might proceed chronologically, in order of importance (the inverted pyramid) or in some other fashion. Data journalists, meanwhile, can organize information by running descriptive statistics on it, by placing it into a relational database or by building a data visualization from it. Whether or not a picture is worth a thousand words, there is value in these approaches both as additional modes of storytelling and as foundations for further analysis.

The third step is explanation. In journalistic terms, this might mean going beyond the who, what, where and when questions to those of why and how. In traditional journalism, stories of this nature are sometimes referred to as “news analysis” or “explanatory journalism.” Data journalists, again, have their own set of techniques — principally running various types of statistical tests to look for relationships in the data.

Let’s pause here for a moment. Up through the first two steps, traditional journalists look very good. The original reporting they do is tremendously valuable. Besides, most of us learn by metaphors and stories. So traditional journalism’s method of organizing information into stories has a lot of appeal when news happens.

By the third stage, however, traditional journalism has begun to produce uneven results — at least in my view. Take the best-selling book “Double Down” by Mark Halperin and John Heilemann. It contains a lot of original and extremely valuable reporting on the 2012 campaign. Its prose style doesn’t match mine, but it’s a crisp and compelling read. But Halperin and Heilemann largely fail at explaining how Barack Obama won re-election, or why the campaign unfolded as it did.

For example, they cite three factors they say were responsible for Mitt Romney’s decline in the polls in early to mid-September: the comparatively inferior Republican convention, Romney’s response to the attacks in Benghazi, Libya, and Romney’s gaffe-filled trip to London. In fact, only one of these events had any real effect on the polls: the conventions, which often swing polls in one direction or another. (This does not require any advanced analysis; it’s obvious from looking at the polls immediately before and after each event.)

Explanation is more difficult than description, especially if one demands some understanding of causality.⁶ It’s something every field struggles with; there are lots and lots of wrongheaded statistical analyses, for instance.

Still, there are some handicaps that conventional journalism faces when it seeks to move beyond reporting on the news to explaining it. One problem is the notion of “objectivity” as it’s applied in traditional newsrooms, where it’s often taken to be synonymous with neutrality or nonpartisanship. I prefer the scientific definition of objectivity, where it means something closer to the truth beyond our (inherently subjective) perceptions. Leave that aside for now, however. The journalistic notion of objectivity, however flawed, at least creates some standard by which facts are introduced and presented to readers.

But while individual facts are rigorously scrutinized and checked for accuracy in traditional newsrooms, attempts to infer causality sometimes are not, even when they are eminently falsifiable. (The increased speed of the news-gathering process no doubt makes this problem worse.⁷) Instead, while the first two steps of the process (collecting and organizing information in the form of news stories) are thought to fall within the province of “objective” journalism, explanatory journalism is sometimes placed in the category of “opinion journalism.” My disdain for opinion journalism (such as in the form of op-ed columns) is well established, but my chief problem with it is that it doesn’t seem to abide by the standards of either journalistic or scientific objectivity. Sometimes it doesn’t seem to abide by any standard at all.

A more data-centric approach is perhaps most helpful, however, when it comes to the fourth step, generalization.

Suppose you did have a credible explanation of why the 2012 election, or the 2014 Super Bowl, or the War of 1812, unfolded as it did. How much does this tell you about how elections or football games or wars play out in general, under circumstances that are similar in some ways but different in other ways?

These are hard questions. No matter how well you understand a discrete event, it can be difficult to tell how much of it was unique to the circumstances, and how many of its lessons are generalizable into principles. But data journalism at least has some coherent methods of generalization. They are borrowed from the scientific method. Generalization is a fundamental concern of science, and it’s achieved by verifying hypotheses through predictions or repeated experiments.

As my book describes, predictions in the sciences (especially the social sciences) are often fairly poor. They usually get better after repeated trials and iterations. But they require a lot of work. One of our sports journalists, Benjamin Morris, suggests that you have almost no hope of beating Las Vegas unless you’ve spent at least 100 hours studying the betting line in question. I can imagine a few exceptions, but it’s a wise rule of thumb.

By contrast, in conventional journalism, predictions are often treated as a parlor game, involving little effort and less accountability. (A variety of studies on the predictions made by McLaughlin Group panelists, for instance, find that they are no more accurate than random guesses.) Predictions are usually outsourced to opinion journalists, who may have less subject-matter knowledge than reporters do.

To reiterate: It’s tough to make predictions, especially about the future. But one of the potential advantages of data journalism is that it generalizes better than traditional approaches, particularly as data sets increase in scale to become larger and more complex. Let me explain this by means of a metaphor.

The books in my office — I have about 500 — are arranged by color. It’s quite aesthetically pleasing. It’s not all that convenient, however, when I have to track down a book. I have to remember its color, or I have to scan through every row and column of the shelf. The color-coding system is perhaps a little better as an organizational method than shelving the books at random, but not a lot better. Still, with 500 books, it’s a manageable problem. In the worst case, I might spend a few minutes looking for a book. I’m willing to make that trade in exchange for having a prettier bookshelf.

But what if instead of having 500 books, I had 50,000, or 50 million? At that point, I’d need a more rigorous approach toward classifying the books — alphabetical order, or the Dewey decimal system, or whatever else. Otherwise, I might spend hours trying to find my copy of “What It Takes.”
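The scaling problem in this metaphor can be made concrete with a short Python sketch. It is only an illustration: the titles are generated placeholders, the shuffled list stands in for a shelf whose order tells you nothing about titles, and alphabetical order stands in for whatever classification scheme you prefer.

```python
import bisect
import random


def find_by_scanning(shelf, title):
    """Check books one at a time; return (index, books_checked)."""
    for checked, book in enumerate(shelf, start=1):
        if book == title:
            return checked - 1, checked
    return None, len(shelf)


def find_in_sorted(shelf, title):
    """Binary search on an alphabetized shelf; return the index or None."""
    i = bisect.bisect_left(shelf, title)
    if i < len(shelf) and shelf[i] == title:
        return i
    return None


# Placeholder library of 50,000 generated titles.
titles = [f"Book {n:05d}" for n in range(50_000)]

# A shelf in an order unrelated to title (seeded so the run is repeatable).
unordered = titles[:]
random.Random(0).shuffle(unordered)

alphabetized = sorted(titles)
target = "Book 31415"

index, books_checked = find_by_scanning(unordered, target)
print(f"Scanning checked {books_checked} books before finding the target")

position = find_in_sorted(alphabetized, target)
print(f"Binary search found it at position {position}")
```

On the unordered shelf, the number of books checked grows in proportion to the size of the library. On the alphabetized shelf, binary search needs on the order of 16 comparisons for 50,000 titles, and only about 10 more for 50 million.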

The point is that there’s a trade-off between vividness and scalability. Narrative accounts of individual news events can be informative and pleasurable to read, and they can have a lot of intrinsic value whether or not they reveal some larger truth. But it can be extraordinarily hard to make generalizations about news events unless you stop to classify their most essential details according to some numbering or ordering system, turning anecdote into data.

By implication, one of the challenges that FiveThirtyEight faces is figuring out how to make data journalism vivid and accessible to a broad audience without sacrificing rigor and accuracy. We have several strategies for this; understanding which ones will work is going to require some experimentation.

One problem with the four-stage process I outlined above is that it implies these steps proceed in a linear fashion, when that isn’t always the case. Often, a failure of prediction or explanation will compel you to retreat to an earlier stage and collect more data or organize it in a better way. By contrast, the most problematic news stories are often those that leap ahead in the process, drawing grand conclusions from thin evidence.

We’re not planning to abandon the story form at FiveThirtyEight. In fact, sometimes our stories will highlight individual cases, anecdotes. When we provide these examples, however, we want to be sure that we’ve contextualized them in the right way. Sometimes it can be extraordinarily valuable to explore an outlier in some detail. But the premise of the story should be to explain why the outlier is an outlier, rather than indicating some broader trend. To classify these stories appropriately, we’ll have to do a lot of work in the background before we publish them.

All of this takes time. That’s why we’ve elected to sacrifice something else as opposed to accuracy or accessibility. The sacrifice is speed — we’re rarely going to be the first organization to break news or to comment on a story. We’ve hired an extraordinary team of editors, led by Mike Wilson. In contrast to our writers, our editors largely do not have quantitative backgrounds. Instead, they will serve as the first (and second and third) line of defense to ensure that our coverage is both accurate and accessible. Where we do react more quickly, such as on DataLab, our blog-like product led by Mona Chalabi and Micah Cohen, we’re going to label our analysis as work in progress.

We are going to screw some things up. We hope our mistakes will be honest ones. We hope you’ll gain insight and pleasure from our approach to the news and that you’ll visit us from time to time. We hope to demonstrate the value of data journalism as a practical and sustainable proposition.

It’s time for us to start making the news a little nerdier.

Footnotes

¹ There was more value, in my view, in the coverage provided by FiveThirtyEight and other teams of data journalists in the weeks and months leading up to the 2012 election. It’s not accurate to say that our forecasts never wavered: President Obama’s probability of winning re-election fell to as low as about 60 percent at various points, and the favored candidate in Ohio, Florida, Colorado and Virginia flipped back and forth at times. Nonetheless, the relative steadiness of the forecasts stood in contrast to the sometimes breathless media coverage, which was quick to proclaim every dubious poll or every minor campaign stumble a “game changer.” Our model, and others like it, served as a more disciplined way to evaluate the evidence and to put the polls into context, alongside other factors like the state of the economy and past voting patterns in each state.

But our forecasts, which are probabilistic in nature, could very easily have been wrong. Soon enough, in 2016 or 2020 or 2024, the underdog will prevail, just as there will be plenty of upsets in the NCAA basketball tournament later this week. The night our election forecasts are wrong will be a bad night for us.
² “Fox No. 9” was how the little guy was labeled on the sheet our logo designer, Michael Meyers, gave to our creative director, Kate Elazegui.
³ The reason for the scare quotes around “traditional” journalism here is that the practices and values associated with the term are principally those of American newspaper and network television newsrooms in the mid-to-late 20th century. Other journalistic norms have existed at other places and times, including those that viewed journalism as more equivalent to social science (a perspective the American journalist Walter Lippmann advocated in 1919). I sometimes prefer the term “conventional journalism” to describe the practices of 20th-century American newsrooms, although I will use that term and “traditional journalism” (without scare quotes) interchangeably throughout this essay. See Michael Schudson’s “Discovering the News” for more.
⁴ The highest-paying college majors are strongly associated with scientific and mathematical rigor; one recent thesis found that every point gained on the mathematics portion of the SAT produces three times the return in salary as compared to the same point gained on the verbal portion of the test.
⁵ The nomination process is a long slog, and party officials have many ways of influencing voters and delegates toward their preferred candidates. Candidates who lack their support can surge in the polls, but their success is often ephemeral, as it was for Newt Gingrich and other candidates in 2012.
⁶ Note, however, that “difficult” is not the same thing as “valuable.” Solving a Rubik’s Cube is difficult, but not particularly valuable to the world.
⁷ I realize it might not sound like it, but my sympathy for traditional reporters has increased a lot since we’ve started to build our own news organization at FiveThirtyEight. The process has made clearer to me that a reporter’s output can reflect the values of her news organization as much as her own. If reporters are under pressure to file stories that get more traffic or “drive the conversation,” they may be encouraged to draw too many conclusions, even when their reported facts are strong and valuable.
