A very special ceremony, honoring the year’s most interesting people and stories from the world of data. Presented by What’s The Point’s Jody Avirgan and the FiveThirtyEight staff.
Most Insidious Manipulation of Data
“The lesson here is just because it’s data doesn’t mean it’s reliable. Explore the provenance of your data.”
It’s been a dubious banner year for lies, damn lies and data. And it probably comes as little surprise that our award for Most Insidious Manipulation of Data goes to Volkswagen. The German car company installed illegal “defeat devices” on its vehicles to skirt regulations in both Europe and the U.S. These devices sensed when a vehicle was in emissions-testing mode and altered the car’s performance to keep emissions below allowable levels. They were, of course, caught, and the fallout has been as dramatic as the con. More than 8 million vehicles in Europe were recalled, and a fix could take up to 10 hours per car. It’ll cost Volkswagen upwards of $2 billion, and the company posted its first quarterly loss in more than a decade.
Honorable mention in this category goes to both DraftKings and FanDuel, the rival daily fantasy sports sites. Among other legal entanglements, both have been accused of being embroiled in a kind of insider trading in which company employees used proprietary data to gain an edge on civilian players. Many of the biggest winners on one of those sites were employed by the other site. — Oliver Roeder
More from FiveThirtyEight: The Volkswagen scandal led to a number of lawsuits.
The Stephen Colbert Memorial Award in Distinction of Truthiness
“I hope this creates a lesson for science. There’s no way you can replicate every study that’s published, but if there is a study that seems almost too heartwarming and too good to be true, maybe it is worth it to spend the time trying to replicate it.”
There were many contenders for the Stephen Colbert Memorial Award in Distinction of Truthiness (“memorial” because Colbert is now playing himself and not “Stephen Colbert”). But Michael LaCour won in a crowded field. The former UCLA graduate student made international headlines — and transformed political strategy — with his apparent finding that when gay people canvass on behalf of same-sex marriage, they can change people’s minds simply by talking to them. Turns out, though, that his study was fatally flawed — and possibly faked. (We contacted LaCour before granting him the award but haven’t heard back.) Does that mean science is fatally flawed? My colleague Christie Aschwanden doesn’t think so — and the LaCour episode shows why: David Broockman, who led the research that debunked LaCour, got universal acclaim for the work despite getting little encouragement to pursue it when he first sensed problems.
We also awarded a runner-up for the Colbert award. You’ll have to watch to learn who that was. (Hint: It’s a presidential candidate with an itchy Twitter trigger finger.) — Carl Bialik
More from FiveThirtyEight: How the LaCour scandal shows we’re all vulnerable to fake data.
The Defender of Data
(Elected Official Division)
“At a time when we are seeing so many attacks on government data, it was great to see a government stand up and try to get better data.”
In a year when Congress tried its best to slash funding to the U.S. Census Bureau, we had to look north to find an elected official standing up for the importance of government data. But newly elected Canadian Prime Minister Justin Trudeau is a worthy champion. For those who don’t follow the twists and turns of Canadian survey data, a quick recap: Back in 2010, then-Prime Minister Stephen Harper killed the country’s mandatory long-form census (which asked about topics such as income, education and employment) and replaced it with a voluntary survey. Response rates collapsed, and data quality went with them. But in one of its first acts upon taking power in November, Trudeau’s Liberal government reinstated the long-form census for next year. Data-lovers cheered. — Ben Casselman
The Dumbest Data
(That We Definitely Needed)
“The premise they all fit into is the basic question, ‘Am I normal?’ ”
Many of our Data Awards recognize important, rigorous data sets or the academics and experienced analysts who worked to expose misleading data or enlighten us with good data. But sometimes we want answers to things we’re afraid to ask, the things only your Google search history would expose. Those are the things that bring us a little closer to understanding where we fit in this strange world. So this award goes to our former colleague Mona Chalabi, who during her time at FiveThirtyEight wrote the “Dear Mona” column, which answered those odd but telling questions. She answered 47 “Dear Mona” questions in all, but these three from the past year really stood out to me as the best examples of dumb data (that we definitely needed!).
- What percentage of marriages in the U.S. are between first cousins?
- What is the total number of faces and names an average person can remember?
- Do hospitals experience a larger number of emergency room visits during full moons?
— Allison McCann
More from FiveThirtyEight: The complete archive of “Dear Mona” columns.
Best Data Collection the Government Isn’t Doing (Yet) but Should Be
“What’s interesting about what The Guardian and The Washington Post were able to do is show that it’s maybe not that hard.”
More than 1,000 people have been killed by police officers in the United States this year, but you won’t see that figure in any official statistics — at least not for some time. Even though a 1994 law requires the federal government to track police use of force, the FBI’s method, which focuses on so-called justifiable homicides, significantly undercounts police killings. We know this because of the diligent work of two news organizations, The Guardian and The Washington Post. Each collected data on police killings and covered the topic intensely this year, giving the public a more accurate picture of who dies at the hands of police. Both efforts were built on earlier work by crowdsourced websites, Fatal Encounters and Killed By Police, and introduced further reporting and analysis. They even embarrassed the FBI into collecting better data: The agency now plans to expand its own database of police use of force. Its goal is to publish more reliable statistics by the end of next year. — Simone Landon
The Hack Most Likely to Be a Hollywood Movie Plot
“If there’s a lesson here, it’s that if you’re going to make fun of somebody famous, do not do it in print.”
Although the Sony Pictures hack and its immediate fallout technically happened around this time last year, some of the more interesting and occasionally even positive repercussions went down in 2015. By far the biggest occurred when Jennifer Lawrence went public about her and her co-stars’ compensation for the film “American Hustle,” shedding light on Hollywood sexism in the process. Even though she was a household name after “Hunger Games” and was one of the most famous people in the film, Lawrence and her female co-stars were paid less than the guy who played Hawkeye. This prompted her to speak out about how women are poorly compensated in the industry, drawing attention to the problems of underrepresentation, poor availability of good roles, and wrongheaded ideas the industry has about gender and money. — Walt Hickey
Best Reminder That Science Is Hard
“There’s this growing worry that some of the results that are being published may not be as reliable as we thought.”
Science is hard. Even when studies are done by earnest researchers with the best intentions, they may not produce results that are replicable or definitive. In 2011, researchers at the Open Science Collaboration began a large-scale project to quantify the problem by testing the reproducibility of psychology research. Teams of researchers selected 100 studies published in three high-profile psychology journals during 2008 and, with guidance and input from the original authors, attempted to replicate these studies. The results, published this year, were sobering: Only 36 percent of the replication studies succeeded in reproducing the original results. Although the project clearly shows that psychology research has a reproducibility problem, it also took a first step toward addressing it by investigating the scope of the problem and identifying factors that contribute to it. The Open Science Collaboration has formed an open data network to help researchers conduct studies in a transparent way that will make it easier for other scientists to replicate and build on one another’s findings. — Christie Aschwanden
The Bill de Blasio Award for Best Urban Planning Throwdown
“We dug through and found that Uber is adding tons and tons of pickups. But in the core of Manhattan, where this fight was about, they weren’t.”
Uber aggressively expanded in the New York City market in 2015, fueling fears that the ride-for-hire cars were causing urban gridlock and prompting Mayor Bill de Blasio to call for limits on the size of Uber’s fleet. But has Uber contributed to Manhattan congestion? The answer (with plenty of caveats) seems to be no. Data we received through a Freedom of Information Law request showed that Uber gained millions of pickups between 2014 and 2015, while taxis lost a near-identical number. The city has backed off its regulation plans for now, but a new throwdown with Airbnb — another regulation-skirting Bay Area transplant — looms. — Reuben Fischer-Baum
More from FiveThirtyEight: Our series on how Uber is affecting taxis and traffic in New York City.
The Most Intriguing Sports Wearable, aka the Grossest Misapplication of NSA-Level Technology
“In the past, the guy keeping score on the sidelines would mark down events as best he could, but they weren’t tracking the essence of the game.”
2015 was the year that player-tracking technology crossed over into the mainstream of sports. NFL players were outfitted with RFID chips; every MLB ballpark provided a full season of Statcast’s radar and video data; and the NHL tested similar tracking equipment at its All-Star Game. (The NBA presumably mocked the latecomers and reminded them it was into video-tracking data before it was cool.) We wish we could split the award among all the leagues, but we ultimately chose the NFL because its tech was wearable and because of that tech’s potential to help reduce player concussions in practice. — Neil Paine
More from FiveThirtyEight: Hot Takedown — our sports podcast — discusses the arrival of RFID in the NFL.
The Data Set That Keeps On Giving
“I myself am technically an Ashley Madison user.”
Ashley Madison — a dating website whose tagline is “Life is short. Have an affair.” — endured a massive data breach in July by hackers calling themselves the Impact Team. That breach resulted in a data dump that included tens of millions of email addresses and millions of transaction records, some including credit card numbers. The hackers made a point of noting that the company never actually scrubbed consumer information from its site even though it charged for the privilege of doing so.
Who was in the data set released by hackers (which was publicly posted and searchable)? There were actual users; people, like me, whose email addresses were in the set even though they weren’t users; 70,000 bots — or accounts with no human user on the backend — most designed to seem like women and, presumably, entice more men in; and 15,000 addresses with .mil or .gov government extensions. Ashley Madison and its parent company, meanwhile, are still alive. — Farai Chideya
More from FiveThirtyEight: Trying to understand Ashley Madison’s users.
Part 1 of the podcast available now on iTunes and other apps. Part 2 coming Dec. 24.