Data never has a virgin birth. It can be tempting to assume that the information contained in a spreadsheet or a database is pure or clean or beyond reproach. But this is almost never the case. All data is collected and compiled by someone — either an individual researcher or a government agency or a scientific laboratory or a news organization or someone or something else. Sometimes, the data collection process is automated or programmatic. But that automation process is initiated by human beings who write code or programs or algorithms; those programs can have bugs, which will be faithfully replicated by the computers.
This is another way of saying that almost all data is subject to human error. It’s important both to reduce the error rate and to develop methods that are more robust to the presence of error.1 And it’s important to keep expectations in check when a controversy like the one surrounding the French economist Thomas Piketty arises.
Piketty’s 696-page book “Capital in the Twenty-First Century” has become an unlikely best-seller in the United States. That’s perhaps because it was published at a time when there is rapidly increasing interest in the subject of economic inequality in the U.S.2 But on Friday, the Financial Times’ Chris Giles published a list of apparent errors and methodological questions in the data underpinning Piketty’s work. Piketty has so far responded to the Financial Times only in general terms.
My goal here is not to litigate the individual claims made by Giles; see The New York Times’ Neil Irwin or The Economist’s Ryan Avent for more detail on that. Rather, I hope to provide some broad perspective about data collection, publication and analysis. A series of disclosures: First, my economic priors and preferences are closer to The Economist’s than to Piketty’s.3 Second, I haven’t finished Piketty’s book, although I’ve spent some time exploring his data. Third, I’m no expert on macroeconomic policy or macroeconomic data. Fourth, this comment rather liberally takes advantage of our footnote system; there’s a short version (sans footnotes) and a long version (avec).
My perspective is that of someone who has spent a lot of time compiling and analyzing moderately complex data sets of different kinds. Also, I’m someone who, like Piketty, has seen his public profile grow unexpectedly in recent years. I consider myself extremely fortunate for this — however, I know that attention can sometimes yield disproportionate praise and criticism. Throat-clearing aside, here’s what I have to offer.
Piketty’s data sets are very detailed, and they aggregate data from many original sources. For instance, the data Piketty and the economist Gabriel Zucman compiled on wealth inequality in the United Kingdom for their paper “Capital is Back: Wealth-Income Ratios in Rich Countries, 1700-2010” contains about 220 data series for the U.K. alone, which are hard-coded into their spreadsheet. These data series are compiled from a wide array of original sources, which are reasonably well documented in the spreadsheet.
This type of data-collection exercise (many different data series over many different years, compiled from many countries and many sources) offers many opportunities for error. Part of the reason Piketty’s efforts are potentially valuable is that data on wealth inequality is lacking. But that also means his numbers will not have received as much scrutiny as other data sets.
An extreme contrast would be something like Major League Baseball statistics, almost every detail of which has been scrubbed and scrutinized by enthusiasts for decades. Even so, they contain errors from time to time. There are, however, usually larger gains to be had when data or methods or findings are relatively new, as they are in Piketty’s case. (An analogy is the way a vacuum’s first sweep of the living-room floor picks up a lot more dust and dirt than the second and third attempts.) Perhaps Piketty is guilty of coming to some fairly grand conclusions based on data that has not yet received all that much scrutiny.
What error rate is acceptable? The right answer is probably not “zero.” If researchers kept scrubbing data until it was perfect, they’d never have time for analysis. There comes a point of diminishing returns; that Hack Wilson had 191 RBIs during the 1930 season rather than 190 ought not to have a material impact on any analysis of baseball player performance. At other times, entire articles or analyses or theories or paradigms are developed on the basis of deeply flawed data.
I don’t know where Piketty sits on this spectrum. However, I think Giles (and some of the commentary surrounding his work) could do a better job of describing Piketty’s error rate relative to the overall volume of data that was examined. If Giles scrutinized all of Piketty’s data and found a handful of errors, that would be very different from taking a small subsample of that data and finding it rife with mistakes.
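To make the distinction concrete, here is a minimal sketch of how the same number of errors implies very different error rates depending on how much data was checked. The counts are invented for illustration and are not taken from Giles’s critique or Piketty’s spreadsheets; the function name `error_rate_interval` is likewise hypothetical, and the Wilson score interval is just one reasonable way to put a range around an observed rate.

```python
import math

def error_rate_interval(errors_found, items_checked, z=1.96):
    """Return (observed rate, lower bound, upper bound) using the
    Wilson score interval; z=1.96 gives roughly 95% coverage."""
    if items_checked <= 0:
        raise ValueError("items_checked must be positive")
    n = items_checked
    p = errors_found / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return p, max(0.0, center - half_width), min(1.0, center + half_width)

# Hypothetical scenario A: every one of 1,000 data points was checked.
full_audit = error_rate_interval(errors_found=7, items_checked=1000)

# Hypothetical scenario B: only 20 data points were checked.
small_sample = error_rate_interval(errors_found=7, items_checked=20)

for label, (rate, low, high) in [("full audit", full_audit),
                                 ("small sample", small_sample)]:
    print(f"{label}: observed {rate:.1%}, plausible range {low:.1%} to {high:.1%}")
```

Under these made-up counts, seven errors out of 1,000 points is a rate under 1 percent, while seven out of 20 suggests something much worse, which is why the denominator matters as much as the error count.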
All of this is part of the peer-review process. Academics sometimes think of peer review as a relatively specific activity undertaken by other academics before academic papers or journal articles are published. This process of peer review has been much studied over the years (often in peer-reviewed articles, naturally), and scholars have come to different conclusions about how effective it is in avoiding various types of errors in published research.
I’m not necessarily opposed to this type of peer review. But I think it defines peer review too narrowly and confines it too much to the academy. Peer review, to my mind, should be thought of as a continuous process: It starts from the moment a researcher first describes her result to a colleague over coffee and it never ends, even after her work has been published in a peer-reviewed journal (or a best-selling book). Many findings are contradicted or even retracted years after being published, and replication rates for peer-reviewed academic studies across a variety of disciplines are disturbingly low.
I have a dog in this fight, obviously. I think journalistic organizations from the Financial Times to FiveThirtyEight should be thought of as prospective participants in the peer-review process, meaning both that we provide peer review and that our work is subject to peer review.
I can’t speak for the FT, but I know that FiveThirtyEight gets some things badly wrong from time to time. It’s helpful to have readers who hold us to a very high standard. (A terrific question is whether FiveThirtyEight and other news organizations are transparent enough about their research to be full-fledged participants in the peer-review process. That’s something I should probably address more completely in a separate post, but see the footnotes for some discussion about it.4)
Piketty’s errors would not have been detected so soon had he not published his data in detail. That’s not to say that transparency is an absolute defense.5 But one should also assume that there are as many problems (probably more) with unpublished data, or poorly explained methods.6
The peer-review process ideally involves both exactly replicating a research finding and replicating it in principle. It would be problematic if other researchers couldn’t duplicate Piketty’s data. But it would be at least as problematic — I’d argue more so — if they could replicate it but found that Piketty’s conclusions were not very robust to changes in assumptions or data sources.
Some of Giles’s critique of Piketty gets at this problem. For instance, he calls into question Piketty’s finding that wealth inequality is rising throughout Western Europe, a result which he says depends on a particular series of assumptions and choices that Piketty made.7
Of course, Giles’s methodological choices can be scrutinized, too. Perhaps there’s some reasonable set of assumptions under which wealth inequality is not rising at all in Western Europe, another under which it’s increasing modestly, and a third under which it’s increasing substantially.
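A rough sketch of what such a robustness check looks like in practice: rerun the same calculation under every combination of defensible choices and see how much the conclusion moves. Everything below is a placeholder. The starting and ending wealth shares, the adjustment sizes, the labels `source_a`, `method_y` and so on, and the thresholds in `classify` are all invented for illustration and do not come from Piketty’s or Giles’s work.

```python
from itertools import product

def classify(change, flat_band=0.5, modest_band=3.0):
    """Label a change in percentage points. Thresholds are arbitrary."""
    if abs(change) <= flat_band:
        return "flat"
    if change < 0:
        return "falling"
    return "modest rise" if change <= modest_band else "substantial rise"

def sensitivity(start, end, source_adj, method_adj):
    """Recompute the change in a wealth share under every combination
    of (hypothetical) data-source and method adjustments."""
    results = {}
    for (s_name, s_adj), (m_name, m_adj) in product(source_adj.items(),
                                                    method_adj.items()):
        change = (end + s_adj + m_adj) - start
        results[(s_name, m_name)] = (round(change, 2), classify(change))
    return results

# Invented inputs: a top-decile wealth share of 60% at the start and 62%
# at the end, with adjustments (in percentage points) for each choice.
results = sensitivity(
    start=60.0,
    end=62.0,
    source_adj={"source_a": 0.0, "source_b": -2.0},
    method_adj={"method_x": 0.0, "method_y": 2.5},
)

for choices, (change, label) in results.items():
    print(choices, f"{change:+.1f} points", label)
```

With these made-up inputs, the four combinations yield one flat result, two modest rises and one substantial rise. A finding that holds across most of the grid is more robust than one that appears only under a single combination of choices.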
In the medium term, the better test might be one of research that’s built up from scratch and largely independently of both Piketty and Giles. How robust are their findings to reasonable changes in data and assumptions?8
And in the long run, the best test might be whether Piketty’s hypothesis makes a good prediction about wealth inequality, i.e. whether wealth inequality continues to rise. The prediction won’t be as easy to evaluate as election forecasts are.9 Still, Piketty’s book comes closer to making a testable prediction than much other macroeconomic work.
Science is messy, and the social sciences are messier than the hard sciences. Research findings based on relatively new and novel data sets (like Piketty’s) are subject to one set of problems: the data itself will have been less well scrutinized and is more likely to contain errors, small and large. Research on well-worn data sets is subject to another. Such data is probably in better shape, but if researchers are coming to some new and novel conclusions from it, that may reflect some flaw in their interpretation or analysis.
The closest thing to a solution is to remain appropriately skeptical, perhaps especially when the research finding is agreeable to you. A lot of apparently damning critiques prove to be less so when you assume from the start that data analysis and empirical research, like other forms of intellectual endeavor, are not free from human error. Nonetheless, once the dust settles, it seems likely that both Piketty and Giles will have moved us toward an improved understanding of wealth inequality and its implications.