Global analysis of environmental and socioeconomic factors associated with human burden of environmentally mediated pathogens

Cite this dataset

Sokolow, Susanne et al. (2022). Global analysis of environmental and socioeconomic factors associated with human burden of environmentally mediated pathogens [Dataset]. Dryad.


This repository contains four datasets that support repeatability of the analyses in the Sokolow et al. paper published in Lancet Planetary Health. Descriptions of the four datasets are included in the metadata document. This study found that 80% of pathogen species known to infect humans are environmentally mediated, causing about 40% of contemporary infectious-disease burden (global loss of 130 million years of healthy life annually). More than 91% of this environmentally mediated disease burden occurs in tropical countries, and the poorest countries carry the highest burdens across all latitudes. There were weak associations between disease burden and biodiversity or agricultural land use at the global scale. In contrast, the proportion of people with rural poor livelihoods in a country was a strong proximate indicator of environmentally mediated infectious disease burden there. Political stability and wealth were associated with improved sanitation, better health care, and lower proportions of rural poverty, indirectly resulting in lower burdens of environmentally mediated infections.


We started with a full list of 560 named pathogens (from 197 genera) associated with the pathogens tracked by the World Health Organization (WHO) within category I.A, “Communicable, maternal, perinatal and nutritional conditions: Infectious and parasitic diseases,” of its Global Burden of Disease Estimates from 2015. Next, to account for potential biases that result from the selection of pathogens that the WHO tracks, we also examined the transmission strategies of a random subsample of 250 pathogens, selected with a random number generator from the full list of 1415 described human pathogen species compiled by Taylor et al. in 2001, a list dominated by rare, opportunistic pathogens. By chance, 87 pathogens (from 57 genera) ended up in both the WHO and Taylor subsets (<15% overlap), with the remaining pathogens unique to each list. Across both datasets, we assessed 723 unique human pathogen species (from 292 genera) in total.
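The species totals above follow from simple inclusion-exclusion, which the short Python sketch below reproduces. The three input counts come from the text; the variable names are illustrative, and the choice of the combined list as the denominator for the overlap share is an assumption, since the text does not say how "<15% overlap" was computed.

```python
# Counts taken from the text above.
who_count = 560      # pathogens on the WHO-derived list
taylor_count = 250   # random subsample from Taylor et al. (2001)
shared = 87          # pathogens that appear on both lists

# Inclusion-exclusion: count the shared pathogens only once.
unique_total = who_count + taylor_count - shared

# Assumption: overlap is expressed as a share of the combined list.
overlap_fraction = shared / unique_total

print(unique_total)               # 723
print(f"{overlap_fraction:.1%}")  # 12.0%
```

The result matches the 723 unique species reported, and the overlap share under this assumption is consistent with the stated bound of less than 15%.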

To quantify the global burden caused by environmentally mediated infections, we examined the WHO Global Health Estimates data for 2015. This dataset measures disease burden in all countries around the world using disability-adjusted life years (DALYs), a standard metric for measuring the impact of disease on human well-being. DALYs are calculated as the sum of “years of life lost due to mortality” and “years of healthy life lost due to disability.” Burden data were available for a subset of 153 WHO pathogens, categorized into 51 tracked disease categories.
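As a minimal illustration of the DALY definition above, the following Python sketch sums the two components. The function name `dalys` and the example values are hypothetical and are not taken from the WHO data or from the authors' analysis code.

```python
def dalys(years_of_life_lost: float, years_lost_to_disability: float) -> float:
    """Return DALYs as the sum of the two components defined in the text."""
    if years_of_life_lost < 0 or years_lost_to_disability < 0:
        raise ValueError("burden components must be non-negative")
    return years_of_life_lost + years_lost_to_disability


# Hypothetical values, for illustration only (not from the dataset).
example = dalys(years_of_life_lost=3.0, years_lost_to_disability=1.5)
print(example)  # 4.5
```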

Usage notes

The data are described in README_Metadata_for_Sokolow_et_al_LPH.rtf.

All other files are in CSV format:

- 723Pathogens_EMDorDTD_Sokolowetal.csv

- SEMAnalysis_CountryLevel_Sokolowetal.csv

- WHO_GHE_2000_2015.csv

- WHO_GHEData_2015_Country.csv
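A quick way to get oriented is to print the column headers of each CSV file before consulting the metadata document. The sketch below uses only the Python standard library. The file names come from the list above; the helper names, the download directory, and the `utf-8-sig` encoding are assumptions, and no column names are assumed.

```python
import csv
from pathlib import Path

# File names as listed above.
DATA_FILES = [
    "723Pathogens_EMDorDTD_Sokolowetal.csv",
    "SEMAnalysis_CountryLevel_Sokolowetal.csv",
    "WHO_GHE_2000_2015.csv",
    "WHO_GHEData_2015_Country.csv",
]


def read_header(path):
    """Return the first row of a CSV file (an empty list if the file is empty)."""
    # Assumption: files are UTF-8, possibly with a byte-order mark.
    with open(path, newline="", encoding="utf-8-sig") as handle:
        return next(csv.reader(handle), [])


def summarize_headers(directory):
    """Map each listed file found in `directory` to its header row."""
    summary = {}
    for name in DATA_FILES:
        path = Path(directory) / name
        if path.exists():
            summary[name] = read_header(path)
    return summary


if __name__ == "__main__":
    # Assumption: the files were downloaded into the current directory.
    for name, header in summarize_headers(".").items():
        print(name, header)
```

Files that are missing from the directory are skipped, so the sketch also works with a partial download.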


Bill & Melinda Gates Foundation, Award: OPP1114050

National Science Foundation, Award: DEB-2011179

National Science Foundation, Award: ICER-2024383 (Belmont Collaborative Forum on Climate, Environment and Health)

National Institutes of Health

Alfred P. Sloan Foundation

Program for Disease Ecology, Health and the Environment, Stanford University


Science for Nature & People Partnership (SNAPP) - NCEAS