The Datasets We’re Looking At This Week
You’re reading Data Is Plural, a weekly newsletter of useful/curious datasets. Below you’ll find the June 1, 2022, edition, reprinted with permission at FiveThirtyEight.
School shootings, congressional districts and legislators, wind and solar power, Olympic accounting and “What Middletown Read.”
School shootings, continued. The K-12 School Shooting Database, housed at the Naval Postgraduate School, “documents each and every instance a gun is brandished, is fired, or a bullet hits school property for any reason, regardless of the number of victims, time, day of the week.” It describes 2,000-plus incidents from 1970 to the present and links them to information regarding 3,000-plus victims killed and wounded, 2,200-plus shooters and 2,000-plus weapons. Related: A few years ago, CNN compiled a dataset of 180 school shootings from 2009 to 2018, focusing on incidents where at least one person (not including the shooter) was shot. Previously: Data on school shootings from The Washington Post (DIP 2018.04.25), and on mass shootings from The Violence Project, Gun Violence Archive and Mother Jones (DIP 2021.03.24, DIP 2015.12.09). [h/t Michael A. Rice + Sam Petulla]
Congress, consolidated. CongressData, published last month by political scientists at the Institute for Public Policy and Social Research, “compiles information about all U.S. congressional districts,” the legislators representing them and those legislators’ policymaking behavior (such as committee memberships and number of bills sponsored). The dataset spans 1789-2021, although many of the variables (such as those derived from the Census Bureau’s American Community Survey) are only available for more recent years. [h/t Erik Gahner Larsen]
Wind and solar power. The Global Energy Monitor’s Global Wind Power Tracker is “a worldwide dataset of utility-scale wind facilities,” focusing on those with planned or installed capacities of at least 10 megawatts. It provides each facility’s name, location, status, capacity, installation type, owner and other details. The project launched last week alongside a sibling dataset, the Global Solar Power Tracker. They join a growing collection of trackers from the organization, including those examining coal infrastructure, steel plants and oil and gas resources. [h/t Nathaniel Hoffman]
Olympic accounting. Martin Müller et al. have compiled a dataset of the costs and revenues of three recurring “mega-events”: the Summer Olympic Games, Winter Olympic Games and FIFA Men’s World Cup. For each event between 1964 and 2018, it indicates the number of athletes, number of accredited media, venue costs, organization costs, ticketing revenue, broadcast revenue and sponsorship revenue.
“What Middletown Read.” Thanks to the discovery of “a collection of dusty ledgers” in 2003, researchers have built a database of (nearly) every checkout from Muncie, Indiana’s public library from November 1891 to December 1902. The project, a collaboration between the library and Ball State University, takes its name from a famous sociological study that pseudonymized Muncie as Middletown. Previously: Seattle Public Library checkouts since 2005 (DIP 2017.03.01). [h/t Matt Brown]
Dataset suggestions? Criticism? Praise? Send feedback to email@example.com. Looking for past datasets? This spreadsheet contains them all. Visit data-is-plural.com to subscribe and to browse past editions.