The Datasets We're Looking At This Week
You’re reading Data Is Plural, a weekly newsletter of useful/curious datasets. Below you’ll find the Dec. 7, 2022, edition, reprinted with permission at FiveThirtyEight. After today’s edition, FiveThirtyEight will no longer be republishing Data Is Plural. However, you can still subscribe to the original newsletter at its website.
In this edition: work stoppages, avian flu detections, social media suppression, literature prizewinners and video games.
Work stoppages. The U.S. Bureau of Labor Statistics’ Work Stoppages program collects and publishes data on “major work stoppages involving 1,000 or more workers lasting one full shift or longer.” The program’s main dataset lists each major strike or lockout since 1993, the organizations involved, employer industry and ownership type, start and end dates, number of workers participating and total worker-days idle. Its annual dataset counts the number of major stoppages, workers involved and worker-days idle for each year since 1947. Related: The Federal Mediation and Conciliation Service used to publish data on all stoppages that its mediators entered into its case system (regardless of number of workers involved), but stopped doing so in late 2020. Forest Gregg has rescued and standardized the archived records. [h/t Chartr]
Avian flu detections. The USDA’s Animal and Plant Health Inspection Service has been tracking local cases of avian influenza detected this year in commercial and backyard flocks as well as in wild birds and in mammals. For each infected flock, the agency’s public data indicate the flock’s state, county, producer type, number of birds affected and confirmation date. The mammal and wild bird data indicate the state, county, date detected, flu strain and species. Read more: “More Than 52 Million Birds in the U.S. Are Dead Because of Avian Flu” (Smithsonian Magazine). [h/t Ed Vine]
Social media suppression. Researchers at Surfshark have been monitoring government-imposed social media shutdowns and restrictions, drawing on reports from NetBlocks, Access Now (DIP 2021.11.03), news publications and other sources. The project’s spreadsheet highlights each case from 2015 to the present; it lists the country, start/end date, particular services affected (Facebook, Twitter, YouTube, etc.) and observed connections to political events (e.g., elections, protests, other turmoil). [h/t Agneska Sablovskaja]
Literature prizewinners. Claire Grossman et al. have compiled a dataset of “the winners and judges of prizes for prose, poetry, or unspecified genre between 1918 and 2020 with a purse of $10,000 and over.” The 7,100-plus entries, shared through the Post45 Data Collective, relate to 50 awards and fellowships, plus the Library of Congress’s poet laureateship. Each entry indicates the prize name, institution, type, genre, year and dollar amount, plus the winner/judge name, gender and educational affiliations. [h/t Melanie Walsh]
Video games. Developer Vladimir Belyaev has constructed a dataset representing “all games available” on Steam, the massively popular video game platform and store. It indicates each game’s name, price, developer, languages, genre categories, ratings from third-party websites and more. [h/t Saul Pwanson]