Socioeconomic disparities in subway use and COVID-19 outcomes in New York City

Cite this dataset

Sy, Karla Therese L.; Martinez, Micaela E.; Rader, Benjamin; White, Laura F. (2021). Socioeconomic disparities in subway use and COVID-19 outcomes in New York City [Dataset]. Dryad.


Using data from New York City, we found that there was an estimated 28-day lag between the onset of reduced subway use and the end of the exponential growth period of SARS-CoV-2 within New York City boroughs. We also conducted a cross-sectional analysis of the associations between human mobility (i.e., subway ridership), sociodemographic factors, and COVID-19 incidence as of April 26, 2020. Areas with lower median income, a greater percentage of individuals who identify as non-white and/or Hispanic/Latino, a greater percentage of essential workers, and a greater percentage of healthcare essential workers had greater mobility during the pandemic. When adjusted for the percent of essential workers, these associations do not remain, suggesting that essential work drives human movement in these areas. Increased mobility and all sociodemographic variables (except percent older than 75 years and percent of healthcare essential workers) were associated with a higher rate of COVID-19 cases per 100k, when adjusted for testing effort. Our study demonstrates that the most socially disadvantaged are not only at an increased risk for COVID-19 infection, but also lack the privilege to fully engage in social distancing interventions.
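As a rough illustration of the "cases per 100k" outcome and the testing-effort covariate mentioned above, the following minimal Python sketch converts raw counts into rates per 100,000 residents. The field names (`zcta`, `cases`, `tests`, `population`) and all numeric values are hypothetical placeholders, not taken from the dataset, and the sketch does not reproduce the authors' regression models.

```python
def rate_per_100k(count, population):
    """Return the number of events per 100,000 residents."""
    if population <= 0:
        raise ValueError("population must be positive")
    return count * 100_000 / population


# Made-up example rows; these are NOT values from the Dryad files.
areas = [
    {"zcta": "A", "cases": 250, "tests": 1000, "population": 50_000},
    {"zcta": "B", "cases": 120, "tests": 800, "population": 40_000},
]

for area in areas:
    # Outcome: cumulative COVID-19 cases per 100k residents.
    area["cases_per_100k"] = rate_per_100k(area["cases"], area["population"])
    # One possible proxy for testing effort: tests per 100k residents.
    area["tests_per_100k"] = rate_per_100k(area["tests"], area["population"])
```

Expressing both cases and tests on the same per-capita scale is one way to include testing effort as a covariate; the exact adjustment used in the study may differ.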


Original Data Sources:

1. Weekly Metropolitan Transportation Authority (MTA) New York City transit subway data are publicly available (Link:

2. New York City Department of Health and Mental Hygiene COVID-19 data are openly available (Link:

Usage notes

There are five data sets:

1. Cross-sectional NYC mobility and COVID-19 data (ZCTA-level)

2. Longitudinal NYC COVID-19 data (Borough-level)

3. Longitudinal NYC mobility data (ZCTA and borough-level)

4. Geographic coordinates for NYC (ZCTA-level)

5. Regression output for Figure 4 (ZCTA-level)


Funding

National Science Foundation, Award: 2029421

Tides Foundation, Award: TF2003-089662

National Institute of General Medical Sciences, Award: GM122876
