Data from: Evidence of repeated zoonotic pathogen spillover events at ecological boundaries
Data files
Oct 23, 2024 version files 678.21 KB
- bat_sp.csv (693 B)
- Ebola_complete_data04052023.csv (670.79 KB)
- prim_sp.csv (1.61 KB)
- README.md (1.82 KB)
- Traits.R (3.29 KB)
Abstract
Human encroachment in natural habitats poses a significant threat to humans and wildlife but remains drastically understudied. By forming new ecological boundaries at the human/wildlife interface, anthropogenic modifications to the landscape have altered several ecological processes and disturbed ecological integrity and resilience worldwide. Outbreaks of zoonotic pathogens often occur in human populations at these ecological boundaries, but the mechanisms behind these new emergences remain under investigation. Here, we aim to unravel the roles of both landscape and biotic communities by comparing the characteristics of presence and pseudo-absence localities to provide a better mechanistic understanding of zoonotic disease emergence. Using the Ebola virus as a model, we link pathogen reservoirs and accidental host ranges with human land use using a machine learning framework. Our results show that species range edges and conversion from wildlands to agricultural areas increase Ebola outbreak risk. Moreover, we show clear evidence that species range edges are predominantly composed of agricultural landscape, possibly amplifying pathogen outbreaks. Given the current rate of landscape modification worldwide, we provide novel ecological and evolutionary insights to our understanding of zoonotic pathogen emergence and highlight the risk of aggressively developing ecological boundaries.
The dataset comprises all reported Ebola outbreaks originating from spillover events (more details in the published manuscript).
Description of file structure
- bat_sp: Minimum distances between each reported outbreak and the edge of bat species ranges, used to create the phylogenetic figure
- prim_sp: Minimum distances between each reported outbreak and the edge of primate species ranges, used to create the phylogenetic figure
- Ebola_complete_data04052023: Dataset used to run the analysis
- Traits.R: R code to test the influence of species traits
- GBM_analysis_Ebola_Vmanucript.R: R code used to run all the analyses in the manuscript; it needs the Ebola complete dataset to work
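As a minimal sketch of how the main dataset might be loaded and split before analysis, the R snippet below uses a small inline stand-in for Ebola_complete_data04052023.csv. The column names (Outbreak, L2, Cropland, Dense_settlements) and all values are assumptions made for illustration; the exact headers should be checked against the real file, which would be read with read.csv("Ebola_complete_data04052023.csv") instead.

```r
# Toy stand-in for Ebola_complete_data04052023.csv.
# Column names and values are hypothetical; check them against the real file.
csv_text <- "Outbreak,L2,Cropland,Dense_settlements
1,a,0.6,0.1
0,b,0.2,0.0
0,c,0.0,0.3"

ebola <- read.csv(text = csv_text)

# L2 is only an identifier used for re-merging, so it can be dropped.
ebola$L2 <- NULL

# Separate confirmed outbreaks (1) from pseudo-absence points (0).
presence       <- ebola[ebola$Outbreak == 1, ]
pseudo_absence <- ebola[ebola$Outbreak == 0, ]

# Quick check of the class balance.
print(table(ebola$Outbreak))
```

Keeping presence and pseudo-absence rows in one data frame, with Outbreak as the response, is what a presence/pseudo-absence model would typically expect; the split above is only for inspection.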
Description of the data
This dataset has numerous columns covering all the variables extracted from the spatial analysis, which are fully described in the text. Here is a breakdown by column type for Ebola_complete_data04052023:
- Number of cases: Number of human cases of Ebola (empty cells mean that no case number was reported for that outbreak)
- Lat/Long: Coordinates of the outbreak
- Outbreak: 0 represents a pseudo-absence datapoint, 1 represents a confirmed Ebola outbreak
- L2: Identifier used for re-merging data; can be discarded
- From Cropland to Dense_settlements columns: Proportion (from 0 to 1) of each SEDAC land use type
- Nb_lt: Number of land use types present
- diversity: Simpson diversity index for land use types, calculated with the vegan package in R
- All of the Min_dist columns: Minimum distance from each outbreak to the range edge of each individual fruit bat and primate species in the IUCN database
All distance columns are in meters, the unit returned by the st_distance call from the sf package.
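The two derived land-use columns, Nb_lt and diversity, can be illustrated with a short base R sketch. It assumes the Simpson index is the 1 - sum(p^2) form that vegan's diversity() returns for index = "simpson"; which vegan index variant the authors used is not stated here, so this is an assumption. The land-use names Rangeland and Wildlands and all proportions are hypothetical.

```r
# Simpson diversity in the 1 - sum(p^2) form (assumed; matches what
# vegan::diversity(x, index = "simpson") computes).
simpson_index <- function(p) {
  p <- p[p > 0]      # land types that are absent contribute nothing
  p <- p / sum(p)    # renormalise in case proportions do not sum to 1
  1 - sum(p^2)
}

# Hypothetical land-use proportions for one locality.
land_use <- c(Cropland = 0.5, Rangeland = 0.25,
              Dense_settlements = 0.25, Wildlands = 0)

nb_lt <- sum(land_use > 0)          # number of land types present
div   <- simpson_index(land_use)    # Simpson diversity of land types

print(nb_lt)
print(div)
```

A locality covered by a single land type gets a diversity of 0, and the value approaches 1 as cover is spread evenly over more land types.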
Code/Software
Fully annotated code is provided with this submission.
We compiled all reported outbreaks with human cases from Kuhn (2008), the Centers for Disease Control (CDC 2023), and ProMED (Yu and Lawrence 2004). We then pruned the dataset to include only outbreaks confirmed to be Ebola virus (including both Sudan ebolavirus and Zaire ebolavirus) by laboratory testing and that originated from a wild source (e.g., we excluded data points from accidental laboratory spillover events). For each outbreak, we compiled the date of outbreak, its geographical origin, and the number of human cases, allowing us to work with 44 outbreaks ranging from 1976 to 2020, inclusively (and excluding outbreaks reported after we initiated our work).