Project Overview
This portion of the COVID DIARIES project provides full bibliographic information (including original and permanent links) for media items related to the COVID-19 vaccination program, published on the official websites of 20 major U.S. news outlets, including television networks, magazines, and newspapers. It spans the period from December 2020, when states began implementing Phase 1a of the vaccine allocation plan, through September 2021, when vaccines became widely available to all adults and were frequently mandated.
News items were collected to preserve a contemporaneous record of how the vaccination effort was discussed across national media. The dataset enables researchers to analyze media communication strategies during a nationwide public health emergency, with the broader aim of informing more effective public health messaging through mass media.
This project represents a collaborative effort between the Yale School of Medicine and the Tobin Center for Economic Policy.
Data and Data Collection Overview
This collection comprises 5,383 unique publication links from 20 major news outlets (television networks, magazines, and newspapers), published between December 1, 2020, and September 30, 2021. Only articles that were freely accessible online without subscription or paywall restrictions were included. Articles were collected by the research team (specifically AM) between August 2021 and November 2023. The 20 news outlets were selected based on a 2020–2021 survey of 511 U.S. adults, which identified the outlets most commonly used to obtain information about the COVID-19 vaccination program. A full list of news outlets, along with their reported usage and perceived trustworthiness, is provided in Sources_Selection.docx.
Online publications were identified through Google searches restricted to a custom date range in week-long increments (e.g., 12/01/2020–12/07/2020), combining the keyword “vaccine” with the link to the respective news outlet’s website. Search results were manually reviewed by AM according to the inclusion and exclusion criteria listed below.
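The week-long search windows described above can be enumerated programmatically. The following is a minimal Python sketch, not the procedure the research team used (the searches and review were done manually). The function names weekly_windows and format_window are hypothetical, and the sketch assumes the final window is simply cut off at September 30, 2021.

```python
from datetime import date, timedelta

def weekly_windows(start, end):
    """Return (first_day, last_day) pairs covering start..end in 7-day steps.

    Assumption: the last window is shortened so it does not run past `end`.
    """
    windows = []
    current = start
    while current <= end:
        last = min(current + timedelta(days=6), end)
        windows.append((current, last))
        current = last + timedelta(days=1)
    return windows

def format_window(window):
    """Format a window in the MM/DD/YYYY style used in this narrative."""
    first, last = window
    return f"{first:%m/%d/%Y}–{last:%m/%d/%Y}"

# Collection period stated in this narrative.
windows = weekly_windows(date(2020, 12, 1), date(2021, 9, 30))
for window in windows[:3]:
    print(format_window(window))
```

Each window can then be entered as the custom date range for one search per outlet.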
Inclusion criteria:
- Articles published on the selected U.S. news outlets’ websites ending in “.com” or “.co” that relate to the COVID-19 vaccination program;
- Articles from the selected international news outlets that serve both their country of origin and the U.S. audience (e.g., BBC, The Daily Mail).
Exclusion criteria:
- Articles published on the websites of international news outlets that exclusively serve their country of origin (e.g., domains ending in .uk or .ca without .com or .co);
- Publications from universities, government agencies, or other organizations not affiliated with major U.S. news outlets (e.g., domains ending in .edu, .gov, .org);
- Videos without accompanying transcripts;
- Publications without textual content;
- Articles referencing vaccines unrelated to COVID-19;
- Non-English language publications.
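The domain-related parts of these criteria could be approximated with a first-pass filter such as the Python sketch below. It is one possible reading of the rules, not the team's method: the review was manual, and the content-based criteria (topic, language, transcripts) are not covered. The function name domain_passes_screen is hypothetical, and the sketch assumes that a host counts as ".com" or ".co" when either label appears among its last two labels, and that every URL includes a scheme such as https://.

```python
from urllib.parse import urlparse

# Suffixes named in the exclusion criteria (universities, government, other organizations).
EXCLUDED_SUFFIXES = {"edu", "gov", "org"}
# Labels named in the inclusion criteria.
ACCEPTED_LABELS = {"com", "co"}

def domain_passes_screen(url):
    """Return True if the URL's host passes the domain-based screen.

    Assumption: only the last two host labels are inspected, so a host
    ending in ".co.uk" passes while one ending in plain ".uk" does not.
    URLs without a scheme have no parsed hostname and are rejected.
    """
    host = urlparse(url).hostname or ""
    labels = host.lower().split(".")
    if labels[-1] in EXCLUDED_SUFFIXES:
        return False
    return any(label in ACCEPTED_LABELS for label in labels[-2:])

# Placeholder URLs for illustration only.
for url in ["https://www.example.com/story",
            "https://www.example.uk/story",
            "https://www.example.edu/story"]:
    print(url, domain_passes_screen(url))
```

A filter like this can only narrow the candidate list; whether an outlet serves a U.S. audience still requires human judgment.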
Selection and Organization of Shared Data
The full list of publications is provided in the data file named "News_Outlets_Publications_Full_List." Entries are organized by news outlet (one per tab), then by publication year, month, week, and article title within each tab.
For each entry, the list includes the date on which the research team originally downloaded the article, the file format (e.g., PDF), the original link to the publication, and a permanent link record.
The list was verified by MC, CA, AV, AG, and AM, with final quality control performed by AM. Each article was assigned a unique identifier in the format: "Article Title – News Outlet Name", ensuring that each entry appears only once in the final dataset.
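The identifier format above lends itself to a simple duplicate check. The Python sketch below illustrates the idea under stated assumptions: the field names "title" and "outlet", the helper names make_identifier and drop_duplicates, the whitespace trimming, and the sample entries are all hypothetical and are not taken from the data file.

```python
def make_identifier(title, outlet):
    """Build an identifier in the 'Article Title – News Outlet Name' format.

    Assumption: surrounding whitespace is trimmed before joining.
    """
    return f"{title.strip()} – {outlet.strip()}"

def drop_duplicates(entries):
    """Keep the first entry for each identifier, preserving input order."""
    seen = set()
    unique = []
    for entry in entries:
        key = make_identifier(entry["title"], entry["outlet"])
        if key not in seen:
            seen.add(key)
            unique.append(entry)
    return unique

# Made-up entries for illustration only.
entries = [
    {"title": "Example headline A", "outlet": "Outlet X"},
    {"title": "Example headline A ", "outlet": "Outlet X"},
    {"title": "Example headline A", "outlet": "Outlet Y"},
]
unique_entries = drop_duplicates(entries)
print(len(unique_entries))
```

Because the outlet name is part of the identifier, the same headline published by two different outlets is kept as two separate entries.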
Additional documentation includes this Data Narrative, a document explaining the source selection, and an administrative README file.