The Datasets We're Looking At This Week
You’re reading Data Is Plural, a weekly newsletter of useful/curious datasets. Below you’ll find the Sept. 21, 2022, edition, reprinted with permission at FiveThirtyEight.
Labor turnover, biodiversity trends, probabilistic predictions, working artists and Atari emails.
Labor turnover. The monthly Job Openings and Labor Turnover Survey from the U.S. Bureau of Labor Statistics estimates the number of jobs that people quit, the number of people fired or laid off, the number of new hires and the current number of open positions. Those estimates, based on data gathered from a sample of businesses across the country, are available to download and query by state, industry and business size. They include most types of workers, regardless of whether they’re full-time or part-time, permanent or seasonal, salaried or hourly.
Biodiversity trends. Maria Dornelas et al.’s BioTIME project has collected and standardized data from hundreds of studies examining ecological communities over time. You can browse and search the studies by year, taxon, species and biome. You can also download the full dataset, which provides information about each study (biome, start/end years, number of species tallied and much more) and each sample collected (date, location, species, abundance and biomass). As seen in: “Economic Production and Biodiversity in the United States,” by Yuanning Liang et al.
Probabilistic predictions. Metaculus is a forecasting platform whose community has registered more than one million predictions on questions such as “Will a major nuclear power plant in Germany be operational on June 1, 2023?” The website’s API provides data on questions posed, user rankings and other aspects of the platform. For each question, you can see its phrasing, date posed, creator, prediction type, the distribution of predictions and more. Related: Zoltar, a forecast archive assembled by Nicholas G. Reich et al. Previously: FiveThirtyEight’s assessment of its own predictions (DIP 2019.04.10).
Working artists. The National Endowment for the Arts regularly produces statistical profiles of the arts in the United States. The latest, “Artists in the Workforce: National and State Estimates for 2015-2019,” is tabulated from the Census Bureau’s American Community Survey. It provides employment and earnings estimates by artistic occupation and demographic group. Additional tabulations, including for the country’s 25 largest metro areas, are available through the National Archive of Data on Arts and Culture. [h/t Gary Price]
Atari emails. A couple of decades ago, Jed Margolin posted a cache of electronic mail messages from his time as a video game hardware engineer at Atari (and Atari Games, a successor company). In 2017, with Margolin’s permission, Vikram Oberoi scraped the 4,000-plus emails and built atariemailarchive.org, which groups the messages into threads, categories and a list of favorites. The project also includes a database file containing each message’s sender, recipients, timestamp, subject, body and Oberoi’s thread grouping. Related: “How I made atariemailarchive.org.”
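For readers who want to poke at a message database like this one, here is a minimal sketch of counting messages per thread. It assumes a SQLite-style table; the table name `messages`, the column names and the sample rows are all hypothetical, modeled on the fields listed above rather than on the archive’s actual file layout.

```python
import sqlite3

# Hypothetical schema: the table and column names are assumptions based on
# the fields described in the text, not the archive's real layout.
SCHEMA = """
CREATE TABLE messages (
    sender     TEXT,
    recipients TEXT,
    timestamp  TEXT,
    subject    TEXT,
    body       TEXT,
    thread_id  INTEGER
)
"""

# Made-up placeholder rows for illustration; these are not real emails.
SAMPLE_ROWS = [
    ("alice", "bob", "1984-01-05T09:00:00", "Board rev", "First draft.", 1),
    ("bob", "alice", "1984-01-05T10:30:00", "Re: Board rev", "Looks fine.", 1),
    ("carol", "alice,bob", "1984-02-01T14:00:00", "Lunch", "Noon?", 2),
]


def build_demo_db():
    """Create an in-memory database loaded with the placeholder rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(SCHEMA)
    conn.executemany(
        "INSERT INTO messages VALUES (?, ?, ?, ?, ?, ?)", SAMPLE_ROWS
    )
    conn.commit()
    return conn


def thread_summary(conn):
    """Return (thread_id, message_count, first_timestamp), longest thread first."""
    query = """
        SELECT thread_id, COUNT(*) AS n, MIN(timestamp) AS started
        FROM messages
        GROUP BY thread_id
        ORDER BY n DESC, thread_id
    """
    return conn.execute(query).fetchall()


if __name__ == "__main__":
    conn = build_demo_db()
    for thread_id, n, started in thread_summary(conn):
        print(thread_id, n, started)
```

To try this against a real file, you would swap `":memory:"` for the file’s path and adjust the table and column names to whatever the file actually uses.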
Dataset suggestions? Criticism? Praise? Send feedback to email@example.com. Looking for past datasets? This spreadsheet contains them all. Visit data-is-plural.com to subscribe and to browse past editions.