Trait-based sensitivity of large mammals to a catastrophic tropical cyclone: DNA metabarcoding data
Data files
Nov 29, 2023 version files 23.61 GB
- adonis_pairwise_apr18.csv
- adonis_pairwise_apr19.csv
- adonis_pairwise_jul16.csv
- adonis_pairwise_jul18.csv
- adonis_pairwise_jul19.csv
- adonis_pairwise_nov18.csv
- adonis_pairwise_nov19.csv
- African_Wild_Dog_Diet_Data_Filtered.csv
- African_Wild_Dog_Diet_Data_Raw.fastq.gz
- Antelope_MCP.zip
- Herbivore_Diet_Data_EarlyDry_2016_Filtered.txt
- Herbivore_Diet_Data_EarlyDry_2018_Filtered.txt
- Herbivore_Diet_Data_EarlyDry_2019_Filtered.txt
- Herbivore_Diet_Data_EarlyDry_2019_Raw.fastq.gz
- Herbivore_Diet_Data_LateDry_2018_Filtered.txt
- Herbivore_Diet_Data_LateDry_2019_Filtered.txt
- Herbivore_Diet_Data_LateDry_2019_Raw.fastq.gz
- Herbivore_Diet_Data_LateWet_2018_Filtered.txt
- Herbivore_Diet_Data_LateWet_2018_Raw.fastq.gz
- Herbivore_Diet_Data_LateWet_2019_Filtered.txt
- Herbivore_Diet_Data_LateWet_2019_Raw.fastq.gz
- Herbivore_Dung_Samples_Positively_Identified.csv
- Herbivore_Dung_Samples_Unidentified.csv
- Herbivore_Floodplain_Use.csv
- Kingdon_bodymass.csv
- New_Flood.zip
- png_shp.zip
- README.md
Abstract
Extreme weather events perturb ecosystems and increasingly threaten biodiversity [1]. Ecologists emphasize the need to forecast and mitigate the impacts of these incidents, which requires knowledge of how risk is distributed among species and environments, but the scale and unpredictability of extreme events complicate assessment [1–4]. These challenges are compounded for large animals (‘megafauna’), which play crucial ecological roles but are hard to study [5]. Traits such as body size, dispersal ability, and habitat affiliation are among the hypothesized determinants of animals’ vulnerability to natural hazards [1,6,7]. However, it has rarely been possible to test these propositions or, more generally, to link short- and longer-term effects of weather-related disturbance [8,9]. Here, we show how large herbivores and carnivores in Mozambique responded to Intense Tropical Cyclone Idai, the deadliest storm on record in Africa, across scales ranging from individual decisions in the hours after landfall to community-level responses nearly 20 months later. Animals occupying low-elevation habitats exhibited strong spatial responses to rising floodwaters. Body size predicted species’ subsequent numerical responses: small-bodied species exhibited the greatest population declines. We trace this sensitivity to limited mobility, which increased likelihood of death during the flood and constrained animals’ capacity to withstand food shortages afterward. Our results identify potentially general trait-based mechanisms underlying animal responses to severe weather and may help to inform strategies for wildlife conservation in a volatile climate.
1. Climate Change 2022: Impacts, Adaptation and Vulnerability. Contribution of Working Group II to the Sixth Assessment Report of the Intergovernmental Panel on Climate Change [H.-O. Pörtner, D.C. Roberts, M. Tignor, E.S. Poloczanska, K. Mintenbeck, A. Alegría, M. Craig, S. Langsdorf, S. Löschke, V. Möller, A. Okem, B. Rama (eds.)]. Cambridge University Press, Cambridge, UK and New York, NY, USA (2022).
2. Smith, M. An ecological perspective on extreme climatic events: A synthetic definition and framework to guide future research. J. Ecol. 99, 656-663 (2011).
3. Ummenhofer, C. C., & Meehl, G. A. Extreme weather and climate events with ecological relevance: a review. Phil. Trans. R. Soc. B. 372, 20160135 (2017).
4. Jentsch, A., Kreyling, J., & Beierkuhnlein, C. A new generation of climate-change experiments: events, not trends. Front. Ecol. Environ. 5, 365-374 (2007).
5. Pringle, R. M., et al. Impacts of large herbivores on terrestrial ecosystems. Current Biology 33, R584-R610 (2023).
6. Spiller, D. A., Losos, J. B., & Schoener, T. W. Impact of a catastrophic hurricane on island populations. Science 281, 695-697 (1998).
7. Schoener, T. W., & Spiller, D. A. Nonsynchronous recovery of community characteristics in island spiders after a catastrophic hurricane. PNAS 103, 2220-2225 (2006).
8. Pruitt, N., Little, A. G., Majumdar, S. J., Schoener, T. W., & Fisher, D. N. Call-to-Action: A global consortium for tropical cyclone ecology. TREE 34, 588-590 (2019).
9. Lin, T. C., Hogan, J. A., & Chang, C. T. Tropical cyclone ecology: a scale-link perspective. TREE 35, 594-604 (2020).
README: Trait-based sensitivity of large mammals to a catastrophic tropical cyclone: DNA metabarcoding data
https://doi.org/10.5061/dryad.7wm37pvzv
This dataset contains raw and filtered sequence data derived from metabarcoding DNA extracted from the dung of large herbivores and African wild dogs (Lycaon pictus) in Gorongosa National Park, Mozambique. Alongside these diet-data files are files that match each sample's original name (i.e., that used in the raw diet data files) with its corrected name (i.e., that used in the filtered data files); the corrections derive from genetic testing of the samples. The dataset also contains a file describing use of the floodplain habitat by large herbivore species in Gorongosa National Park, derived from aerial count data (see Walker et al. 2023).
Description of the data and file structure
Raw Diet Data Files
Relevant Files: Herbivore_Diet_Data_LateWet_2018_Raw.fastq.gz; Herbivore_Diet_Data_LateWet_2019_Raw.fastq.gz; Herbivore_Diet_Data_EarlyDry_2019_Raw.fastq.gz; Herbivore_Diet_Data_LateDry_2019_Raw.fastq.gz; African_Wild_Dog_Diet_Data_Raw.fastq.gz
File Structure: Files are compressed FASTQ files that can be expanded using the 'gunzip' command. These FASTQ files store raw DNA sequence data for each fecal-sample collection period (herbivore diets) and a single combined sequence dataset for African wild dog (AWD) diet data. These files contain unfiltered DNA sequences of the P6 loop of the trnL (UAA) intron (herbivore diets) and the 16S region (AWD diets). In all cases, paired-end sequences were first aligned using the OBITools command "illuminapairedend" (arguments: --sanger --score-min=40 -r) and fecal sample identities were associated with sequences using the "ngsfilter" command in the same program (arguments: -e 2). No other processing or filtering has been done. Each sequence record contains the DNA sequence, its length, dual-index tag identities, primers used, quality and matching scores, and the fecal sample the sequence was retrieved from. All downstream processing steps for these raw data files are described in the Methods section of Walker et al. 2023.
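As a minimal sketch of how the per-record annotations described above might be read without first expanding the files, the Python below streams a gzipped FASTQ and splits each header into an identifier and a dictionary of annotations. It assumes the headers follow the usual OBITools "key=value;" annotation layout; the function names and the demonstration record (including the keys `sample`, `score`, and `seq_length`) are illustrative and are not taken from the archive.

```python
import gzip
import io
from typing import Dict, Iterator, Tuple


def parse_obitools_header(header: str) -> Tuple[str, Dict[str, str]]:
    """Split '@id key=value; key=value; ...' into (id, annotations)."""
    body = header[1:] if header.startswith("@") else header
    seq_id, _, rest = body.partition(" ")
    annotations: Dict[str, str] = {}
    for field in rest.split(";"):
        key, sep, value = field.strip().partition("=")
        if sep:  # ignore fragments that are not key=value pairs
            annotations[key] = value
    return seq_id, annotations


def read_fastq(handle) -> Iterator[Tuple[str, Dict[str, str], str, str]]:
    """Yield (id, annotations, sequence, quality) from a four-line-per-record FASTQ."""
    while True:
        header = handle.readline().rstrip("\n")
        if not header:
            break
        sequence = handle.readline().rstrip("\n")
        handle.readline()  # '+' separator line
        quality = handle.readline().rstrip("\n")
        seq_id, annotations = parse_obitools_header(header)
        yield seq_id, annotations, sequence, quality


# Synthetic record for demonstration only (hypothetical keys and values).
demo = b"@read_1 sample=KOEL_001; score=85.0; seq_length=10; \nACGTACGTAC\n+\nIIIIIIIIII\n"
with gzip.open(io.BytesIO(gzip.compress(demo)), "rt") as fh:
    records = list(read_fastq(fh))

for seq_id, annotations, sequence, _quality in records:
    print(seq_id, annotations.get("sample"), len(sequence))
```

For one of the archived files, the same reader could be pointed at, for example, `gzip.open("Herbivore_Diet_Data_LateWet_2018_Raw.fastq.gz", "rt")`; the actual annotation keys should be checked against the first few records.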
Filtered Herbivore Diet Data Files
Relevant Files: Herbivore_Diet_Data_EarlyDry_2016_Filtered.txt; Herbivore_Diet_Data_LateWet_2018_Filtered.txt; Herbivore_Diet_Data_EarlyDry_2018_Filtered.txt; Herbivore_Diet_Data_LateDry_2018_Filtered.txt; Herbivore_Diet_Data_LateWet_2019_Filtered.txt; Herbivore_Diet_Data_EarlyDry_2019_Filtered.txt; Herbivore_Diet_Data_LateDry_2019_Filtered.txt
File Structure: Filtered herbivore diet data tables are tab-delimited files describing both the composition of herbivore diets in each period and the best matches between the retrieved sequences and DNA reference databases (described in Methods). The two facets of the data table can be split into separate objects in the R statistical software by reading the data tables with the "import.metabarcoding.data" function of the ROBITools package. One of these tables (i.e., the 'reads' slot in S4 class R parlance) has fecal samples in the rows and molecular operational taxonomic units (mOTUs; the DNA sequences that passed bioinformatic filtering steps) in the columns. mOTUs are referred to by their sequence 'id' in the column names. Each element in this table reports the proportional contribution (i.e., relative read abundance) of each mOTU in the plant DNA extracted from each fecal sample. Rows sum to 1. The second table (i.e., the 'motus' slot in the data object) describes each mOTU found in the set of fecal samples from which DNA was extracted. This table has mOTUs in the rows (identified and matched with the @reads table via the 'id' column) and columns reporting the degree of matching between each mOTU and reference databases, prevalence in the dataset, and bioinformatic filtering scores. Column names containing capitalized three-letter codes (i.e., 'PNG', 'GDB', 'SER', 'MRC') report the results of matching each mOTU to a specific DNA reference library. PNG refers to the Gorongosa National Park reference library; SER refers to Serengeti National Park; MRC refers to Mpala Research Center, Kenya; GDB refers to a global database based on the European Molecular Biology Laboratory database. Columns report the following data for each mOTU (* indicates that the description applies to more than one column):
id: sequence identity, matches column names in @reads table
best_identity*: the best proportional match between the mOTU and a reference sequence
best_match*: the identifier for the reference sequence best matching the mOTU
count: the number of times the mOTU occurred in the sequencing dataset
scientific_name*: the most specific taxonomy identity attributable to the mOTU based on matching reference sequences
species_list*: the species records from the reference set that best match the mOTU
sequence: the mOTU's nucleotide sequence
db_ok: the reference database (PNG, GDB, MRC, SER) that an mOTU was preferentially matched to
bid_ok: the best proportional match between the mOTU and a sequence in the preferred database
*_ok: hierarchical taxonomic identifiers for each mOTU based on matching to the preferred database
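For readers working outside R, the split into a 'reads' table and a 'motus' table can be approximated as in the following Python sketch. It assumes the tab-delimited file stores one mOTU per row and that per-sample columns carry a "sample:" prefix (the OBITools tabular convention); that layout is an assumption to verify against the actual files, and the demonstration table is synthetic.

```python
import csv
import io
from typing import Dict, List, Tuple

SAMPLE_PREFIX = "sample:"  # assumption: OBITools-style prefix on per-sample columns


def split_metabarcoding_table(handle) -> Tuple[Dict[str, Dict[str, float]], List[Dict[str, str]]]:
    """Return (reads, motus).

    reads: sample -> {mOTU id -> relative read abundance} (samples as rows)
    motus: one dict of descriptive columns per mOTU
    """
    reads: Dict[str, Dict[str, float]] = {}
    motus: List[Dict[str, str]] = []
    for row in csv.DictReader(handle, delimiter="\t"):
        motu_id = row["id"]
        motu_info: Dict[str, str] = {}
        for column, value in row.items():
            if column.startswith(SAMPLE_PREFIX):
                sample = column[len(SAMPLE_PREFIX):]
                reads.setdefault(sample, {})[motu_id] = float(value)
            else:
                motu_info[column] = value
        motus.append(motu_info)
    return reads, motus


# Synthetic two-mOTU, two-sample table (hypothetical values).
demo_table = (
    "id\tcount\tdb_ok\tsample:KOEL_001\tsample:AEME_002\n"
    "motu_1\t120\tPNG\t0.75\t0.25\n"
    "motu_2\t80\tGDB\t0.25\t0.75\n"
)
reads, motus = split_metabarcoding_table(io.StringIO(demo_table))

# Each sample's proportions should sum to 1, as stated for the 'reads' table.
row_sums = {sample: sum(values.values()) for sample, values in reads.items()}
print(row_sums)
```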
Herbivore Sample Identity Testing Files
Relevant Files: Herbivore_Dung_Samples_Unidentified.csv; Herbivore_Dung_Samples_Positively_Identified.csv
File Structure: Samples that were collected in the field but could not be assigned with high confidence to one species are listed in the first file (Herbivore_Dung_Samples_Unidentified.csv) under the column 'SampleID', which is the only column in the comma-delimited file. These samples are present in the "Raw" data files but were removed from the "Filtered" data files and are not included in Walker et al. (2023) analyses. The second file (Herbivore_Dung_Samples_Positively_Identified.csv) is a two-column, comma-delimited file reporting the original sample name ('SampleID' column) and the large-herbivore species that sample was identified to originate from ('TopSeqSpeciesID' column). Names of samples correctly identified in the field did not change; names of those incorrectly identified in the field were corrected to the applicable herbivore-species code (see below) for analyses and inclusion in the 'Filtered' data files.
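The relationship between the raw sample names and the two identity-testing files can be sketched as follows. This is an illustration only: the column names 'SampleID' and 'TopSeqSpeciesID' come from the description above, whereas the sample names, the species values, and the helper function names are hypothetical.

```python
import csv
import io


def load_identity_tables(unidentified_handle, identified_handle):
    """Read the single-column and two-column identity files."""
    unidentified = {row["SampleID"] for row in csv.DictReader(unidentified_handle)}
    identified = {
        row["SampleID"]: row["TopSeqSpeciesID"]
        for row in csv.DictReader(identified_handle)
    }
    return unidentified, identified


def filter_raw_samples(raw_sample_ids, unidentified, identified):
    """Drop unidentified samples; attach the genetic species ID where one exists.

    Samples that were never genetically tested map to None.
    """
    kept = {}
    for sample_id in raw_sample_ids:
        if sample_id in unidentified:
            continue
        kept[sample_id] = identified.get(sample_id)
    return kept


# Hypothetical file contents and sample names, for demonstration only.
unidentified_csv = io.StringIO("SampleID\nS003\n")
identified_csv = io.StringIO("SampleID,TopSeqSpeciesID\nS002,REAR\n")
unidentified, identified = load_identity_tables(unidentified_csv, identified_csv)

kept = filter_raw_samples(["S001", "S002", "S003"], unidentified, identified)
print(kept)
```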
Filtered African Wild Dog Diet Data Files
Relevant Files: African_Wild_Dog_Diet_Data_Filtered.csv
File Structure: Comma-delimited file contains 42 rows (wild dog fecal sampling events) and 18 columns (reporting the composition of those fecal samples and when they were collected). The first column ('Date') reports time to/since Cyclone Idai landfall when the fecal-sampling event occurred. The remaining columns (Tragelaphus.sylvaticus to Loxodonta.africana) report the proportion of reads from each fecal-sampling event attributed to each prey species.
Metadata Files
Relevant Files: Herbivore_Floodplain_Use.csv; Kingdon_bodymass.csv; seven files starting with "adonis_pairwise"
File Structure: Comma-delimited files. Herbivore_Floodplain_Use.csv contains three columns: COUNT_YEARS, SPECIES, PROP_IN_FLOODPLAIN. The first column reports the aerial surveys of large herbivores in Gorongosa National Park that these data derive from. The second reports the species name associated with each floodplain-use value. The third reports the proportion of individuals of each species that were observed in the floodplain habitat of Gorongosa National Park. Kingdon_bodymass.csv contains two columns: Species, Kingdon_bodymass. The first reports the species name; the second reports its mean adult body mass in kilograms per Kingdon's Mammals of Africa. Files beginning with the "adonis_pairwise" prefix are pre-processed data files containing results of the R function adonis2 (package vegan) run on each pair of species in each time period (one time period per file). Because each species has a differing number of samples at each time point, we randomly selected samples, ran adonis2, and repeated this procedure 1000 times for each pair of species in each collection period. As such, each collection-period specific file contains the following columns: 'CP' = collection period identity; 'Iteration' = permutation number for that species pair (1:1000); 'Species1' and 'Species2' = the species pair; 'nSamples1' and 'nSamples2' = number of samples from each species used in that permutation of the analysis; 'F' = F-statistic as returned by vegan::adonis2; 'R2' = R-squared statistic returned by vegan::adonis2; 'P' = p-value returned by vegan::adonis2; 'adjP' = repeated-tests adjusted p-value, adjusted using Holm method.
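One way to condense the 1000 iterations per species pair is sketched below in Python: for each collection period and pair, it reports the number of iterations, the proportion with an adjusted p-value below a chosen cutoff, and the mean R2. The column names follow the description above; the 0.05 cutoff, the function name, and the demonstration rows are assumptions for illustration and do not reproduce the summary used in the manuscript.

```python
import csv
import io
from collections import defaultdict


def summarise_pairwise(handle, alpha=0.05):
    """Summarise adonis2 iterations per (CP, Species1, Species2)."""
    totals = defaultdict(lambda: {"n": 0, "n_sig": 0, "r2_sum": 0.0})
    for row in csv.DictReader(handle):
        key = (row["CP"], row["Species1"], row["Species2"])
        entry = totals[key]
        entry["n"] += 1
        entry["n_sig"] += float(row["adjP"]) < alpha  # True counts as 1
        entry["r2_sum"] += float(row["R2"])
    return {
        key: {
            "iterations": e["n"],
            "prop_significant": e["n_sig"] / e["n"],
            "mean_R2": e["r2_sum"] / e["n"],
        }
        for key, e in totals.items()
    }


# Hypothetical rows in the documented column layout.
demo_csv = io.StringIO(
    "CP,Iteration,Species1,Species2,nSamples1,nSamples2,F,R2,P,adjP\n"
    "apr19,1,KOEL,REAR,10,10,3.1,0.10,0.001,0.01\n"
    "apr19,2,KOEL,REAR,10,10,4.0,0.20,0.002,0.03\n"
    "apr19,3,KOEL,REAR,10,10,1.2,0.30,0.150,0.20\n"
    "apr19,4,KOEL,REAR,10,10,5.5,0.40,0.004,0.04\n"
)
summary = summarise_pairwise(demo_csv)
print(summary[("apr19", "KOEL", "REAR")])
```

Rows with missing values (for example "NA") would need to be skipped or handled before the `float` conversions.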
Compressed Directories
Relevant Files: png_shp.zip; Antelope_MCP.zip; New_Flood.zip
Directory Structure: Each directory contains geospatial files necessary to analyze Normalized Difference Vegetation Index for Gorongosa at the two scales tested in the study (i.e., within the network of flood sensors and within the minimum convex polygon occupied by GPS-collared antelope). The analysis script is also provided (NDVI_Analysis.r) and these files are used by that script.
Analysis Scripts
Relevant Files: NDVI_analysis.r; GrassComposition_analysis.r; NicheOverlap_analysis.r; Help_functions.r; DietQuality_analysis.r; DietDiversity_analysis.r
File Structure: Each file is an R script and includes the data processing and analysis code used to produce results presented in the manuscript. Code is commented. For NDVI_Analysis.r, users need to download the Gorongosa shapefile from the World Database of Protected Areas and MODIS data with the MODIStsp package in R. DietDiversity_analysis.r requires users to read in the provided metadata files on floodplain use and species' mean body mass. NicheOverlap_analysis.r requires users to read in the provided pre-analyzed data files (each beginning with 'adonis_pairwise...'); the code used to produce these files is provided within this script. DietQuality_analysis.r requires users to download data on plant nutritional traits from https://doi.org/10.5061/dryad.4f4qrfjdk; the files required from this repository are listed in the script. The HelpFunctions.r script provides user-defined R functions that facilitate data processing.
Key to Species Naming Conventions in Diet Data Files
LYPI = African Wild Dog (Lycaon pictus)
KOEL = Waterbuck (Kobus ellipsiprymnus)
AEME = Impala (Aepyceros melampus)
OUOU = Oribi (Ourebia ourebi)
REAR = Reedbuck (Redunca arundinum)
HINI = Sable antelope (Hippotragus niger)
SYCA = Cape buffalo (Syncerus caffer)
ALBU = Hartebeest (Alcelaphus buselaphus)
COTA = Blue wildebeest (Connochaetes taurinus)
LOAF = Savanna elephant (Loxodonta africana)
TRST = Kudu (Tragelaphus strepsiceros)
TRAN = Nyala (Tragelaphus angasii)
TRSC = Bushbuck (Tragelaphus scriptus/sylvaticus)
PHAF = Warthog (Phacochoerus africanus)
Sharing/Access information
Analyses in the manuscript also use data quantifying plant nutrition. Those data and the matching between mOTUs and field-measured plant traits can be found here: https://doi.org/10.5061/dryad.4f4qrfjdk. Raw herbivore diet data from the early-dry season in 2016, early-dry season in 2018, and late-dry season in 2018 can be found in these two repositories (https://doi.org/10.5061/dryad.brv15dvcj; https://doi.org/10.5061/dryad.sxksn02zc).
Methods
This archive presents raw and filtered data on the diets of 13 species of large mammalian herbivores and African wild dogs (Lycaon pictus) in Gorongosa National Park. Data were generated via DNA metabarcoding of fecal samples in three distinct seasons for large herbivores (late-wet, early-dry, and late-dry) and opportunistically for wild dogs. The methods summary below is from Walker et al. (2023); please see that paper for additional details, references, and context. All filtered datasets used in Walker et al. 2023 are deposited here alongside raw DNA metabarcoding data for African wild dog diets and large-herbivore diets in late-wet season 2018 and all seasons in 2019. Raw data for large-herbivore diet datasets also used in Walker et al. (2023) can be found in the following repositories: https://doi.org/10.5061/dryad.sxksn02zc (2018, early- and late-dry season); https://doi.org/10.5061/dryad.brv15dvcj (2016, early-dry season).
DNA metabarcoding analysis of wild dog diets. Samples were collected on an opportunistic basis with unused nitrile gloves and placed in an unused plastic bag. Sampling date and location were recorded on the bag, which was placed on ice for transport to the field laboratory. There, samples were frozen at -20° C until processing. Every three months, all samples collected from the intervening period were thawed and preprocessed for DNA extraction. Sample preprocessing involved homogenizing each thawed sample by massaging the bag between thumb and forefinger. A pea-sized subsample was then transferred into a plastic tube containing silica bashing beads and 750 μL of buffer (Xpedition Lysis/Stabilization Solution; Zymo Research, CA, USA). We then capped the tube and vortexed it for 30 s to break apart the subsample, exposing it to the buffer. Samples were heat treated at 72° C for 30 min as an antiviral precaution (a step mandated by the US Department of Agriculture for most herbivore species, described below) before being refrozen at -20° C for transport to Princeton University.
In a Biosafety Level 2 facility at Princeton University dedicated to fecal DNA analysis, we extracted DNA from the samples using Zymo Quick-DNA Soil/Fecal Microbe MiniPrep kits (Zymo Research, CA, USA) following manufacturer instructions. DNA extractions comprised batches of 29 samples (wild dog samples were extracted as part of a broader extraction session involving other species) and one negative extraction control (750 μL DNA lysis buffer). Extracted DNA was compiled onto two 96-well plates for amplification. Before proceeding with DNA amplification, we standardized DNA concentrations across samples, using the Quant-iT Picogreen assay (ThermoFisher Scientific) to quantify DNA concentrations and adding nucleic-acid-free water (Qiagen, MD) to dilute high-concentration samples. To target mammal DNA, we amplified the mitochondrial 16S gene using an established primer pair (MamP007F, 5’-CGAGAAGACCCTATGGAGCT-3’; MamP007R, 5’-CCGAGGTCRCCCCAACC-3’) [1]. To multiplex PCR products, we tagged the forward and reverse primers with 8-nt tags that each differed by at least 4 nt to distinguish samples after sequencing. To limit amplification of wild dog DNA, we used a blocking primer (5’-GGAGCTTTAATTAACTAACCCAAGCTTACGG-3’) with a C3-spacer added to the 3’ end and a six-base overlap with MamP007F (the first six bases of the blocking-primer sequence, GGAGCT). PCRs contained 2 μL of extracted DNA, 0.5 U of AmpliTaq Gold (Applied Biosystems, MA), 0.2 μg BSA (New England Biolabs, MA), 0.2 μM of each primer, 0.2 mM of each dNTP (New England Biolabs, MA), 2 mM MgCl2 (Applied Biosystems, MA), 1X GeneAmp PCR Buffer II (Applied Biosystems, MA), and 2 μM of the blocking primer, with a final volume of 20 μL. Thermocycling conditions were: initial denaturing (95° C, 10 min), 45 cycles of denaturing (95° C, 30 s), annealing (52° C, 30 s), elongation (72° C, 30 s), and a final extension (72° C, 7 min).
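Two design constraints stated above lend themselves to a quick check: that every pair of 8-nt tags differs at a minimum of four positions, and that the blocking primer begins with the last six bases of MamP007F. The Python sketch below illustrates both. The primer sequences are copied from the text; the tag set is hypothetical (the real tags are not listed here), as are the function names.

```python
from itertools import combinations


def hamming(a: str, b: str) -> int:
    """Number of positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


def min_pairwise_distance(tags) -> int:
    """Smallest Hamming distance among all pairs of tags."""
    return min(hamming(a, b) for a, b in combinations(tags, 2))


# Hypothetical 8-nt tags, for demonstration only.
demo_tags = ["AAAAAAAA", "AAAACCCC", "CCCCAAAA", "CCCCCCCC", "GGGGGGGG"]
tags_ok = min_pairwise_distance(demo_tags) >= 4

# Primer sequences as given in the text.
MAMP007F = "CGAGAAGACCCTATGGAGCT"
BLOCKER = "GGAGCTTTAATTAACTAACCCAAGCTTACGG"
overlap = BLOCKER[:6]
overlap_ok = MAMP007F.endswith(overlap)

print(tags_ok, overlap, overlap_ok)
```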
All PCRs were performed in triplicate and included negative extraction controls, negative PCR controls, and positive PCR controls (where DNA template was provided as a sequence designed in silico to comprise: 5’-10 random bases, MamP007F, a 75-base random barcode, MamP007R, 10 random bases-3’). We confirmed successful amplification of the barcode using gel electrophoresis. We pooled PCR products by plate and purified them with a MinElute PCR Purification kit (Qiagen, MD). Purified PCR products from each plate were submitted for sequencing as equimolar libraries to the Lewis-Sigler Institute for Integrative Genomics at Princeton University, where Illumina tags were appended with a low-cycle PCR approach and libraries were sequenced in paired-end (2×150bp) on a NovaSeq SP 300-nt.
Forward and reverse reads were paired using OBITools’ illuminapairedend command (minimum score = 40) [2]. Sequences were then assigned to the samples they came from (ngsfilter; up to two errors allowed), while sequences that were unaligned, contained ambiguous bases, or were outside the expected barcode length (< 40 or > 140 bp) were removed. Identical sequences were aggregated (obiuniq) and any sequences with only one read in the dataset (i.e., singletons) were removed (obigrep -p ‘count>1’). Taxonomic identifiers were assigned to each sequence using the ecotag command and a 16S reference database created using in silico PCR and the EMBL taxonomic library (release 143). Only sequences belonging to Mammalia (taxonomic ID: 40674) were retained in the dataset. We removed putative chimeras (sequences with < 80% match to reference database) and PCR errors, which were identified using the obiclean command (parameters: -d 1; -r 0.25). Sequences with highest abundance in controls were considered contaminants and removed. PCR replicates were removed if their read depth fell below a critical threshold (1000 reads), or if they were above the 95th quantile for contaminant read abundance.
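The length, ambiguity, and singleton criteria in this paragraph can be restated as a small predicate. The Python below is a sketch of that logic only; it is not the OBITools implementation, and the function name and demonstration sequences are hypothetical.

```python
def passes_basic_filters(sequence: str, count: int,
                         min_len: int = 40, max_len: int = 140) -> bool:
    """Apply the length, ambiguous-base, and singleton criteria described above.

    Sequences shorter than min_len or longer than max_len are rejected, as are
    sequences containing any symbol other than A, C, G, or T, and sequences
    observed only once in the dataset.
    """
    seq = sequence.upper()
    if not (min_len <= len(seq) <= max_len):
        return False
    if set(seq) - set("ACGT"):
        return False
    return count > 1


# Hypothetical sequences, for demonstration only.
candidates = {
    "good": ("ACGT" * 15, 12),             # 60 bp, unambiguous, 12 reads
    "too_short": ("ACGT" * 5, 12),         # 20 bp
    "ambiguous": ("ACGT" * 14 + "ACGN", 12),
    "singleton": ("ACGT" * 15, 1),
}
retained = [name for name, (seq, n) in candidates.items() if passes_basic_filters(seq, n)]
print(retained)
```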
Consistency among PCR replicates was confirmed by comparing their composition. If inter-replicate dissimilarity was greater than the 95th quantile of inter-replicate dissimilarity, or if an inter-replicate comparison fell within the distribution of inter-sample dissimilarity, it was removed. When two or more PCR replicates remained in the dataset, their composition was averaged to give the mean scat-level sequence composition. Finally, we limited the data to sequences that had a perfect match to the reference database and removed sequences accounting for < 1% of each sample’s relative read abundance (RRA) to reduce the likelihood of false positives [3,4]. Secondary predation (i.e., detecting prey of prey) is unlikely in this dataset because wild dog prey were almost exclusively plant-eating ungulates.
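The averaging of retained replicates and the 1% RRA cutoff can be illustrated with the short Python sketch below. Whether proportions were rescaled to sum to 1 after the cutoff is not stated above, so the `renormalise` option is an assumption; the prey labels and values are hypothetical.

```python
def mean_composition(replicates):
    """Average per-sequence proportions across PCR replicates.

    Each replicate is a dict mapping sequence label -> proportion of reads;
    sequences absent from a replicate count as 0 for that replicate.
    """
    labels = set().union(*replicates)
    n = len(replicates)
    return {label: sum(rep.get(label, 0.0) for rep in replicates) / n for label in labels}


def apply_rra_threshold(composition, threshold=0.01, renormalise=True):
    """Remove sequences below the RRA threshold (here, < 1% of a sample's reads)."""
    kept = {label: p for label, p in composition.items() if p >= threshold}
    if renormalise and kept:  # assumption: rescale so proportions sum to 1
        total = sum(kept.values())
        kept = {label: p / total for label, p in kept.items()}
    return kept


# Hypothetical replicate compositions for one scat sample.
replicate_1 = {"waterbuck": 0.900, "reedbuck": 0.095, "oribi": 0.005}
replicate_2 = {"waterbuck": 0.800, "reedbuck": 0.195, "oribi": 0.005}

averaged = mean_composition([replicate_1, replicate_2])
diet = apply_rra_threshold(averaged)
print(sorted(diet))
```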
DNA metabarcoding analysis of herbivore diets. Fecal samples were collected by driving Gorongosa’s road network and collecting samples from defecating ungulates. Additional samples were collected from animals immobilized for GPS-collaring [5-8]. For each sample, we recorded the GPS coordinates of the defecation site and a classification of the surrounding habitat type. Other field handling and preprocessing protocols were as described above for wild dogs. Samples were subjected to an antiviral treatment (72°C for 30 minutes) before import into the United States, as mandated by the US Department of Agriculture (USDA/APHIS/VS permit #130123 to R.M.P.).
On arrival at Princeton University, we extracted DNA from each sample as described above for wild dogs (batches of 29 samples, one extraction control containing lysis buffer only) using Zymo Quick-DNA Fecal/Soil Microbe Mini Prep kits (Zymo Research, CA, USA). DNA extraction, amplification, and sequencing were done in batches corresponding to each sampling period. Extracted DNA from samples was organized into 96-well PCR plates. To limit large differences in post-PCR DNA concentrations, we quantified and standardized the concentration of DNA in samples collected in 2018 and 2019 using Quant-iT Picogreen (ThermoFisher Scientific). Samples with high concentrations were diluted using nucleic-acid-free water. In triplicate for each sample, we amplified the P6 loop of the chloroplast trnL(UAA) intron, a widely used barcode for degraded plant DNA, using the g and h primers designed for it, each carrying a unique 8-nt tag at the 5’ end that enabled pooling of uniquely identifiable PCR products for sequencing in a single high-throughput run [9]. PCRs included negative controls (nucleic-acid-free water), negative extraction controls, and positive controls (2016 only). PCR mixtures comprised 2 μL of DNA extract, 0.5 U AmpliTaq Gold DNA Polymerase, 0.17 mg ml⁻¹ of BSA, 0.2 μM of each primer, 0.2 mM of each dNTP, 2.5 mM of MgCl2, 0.4% dimethyl sulfoxide, and 1X GeneAmp PCR Buffer II in a final reaction volume of 20 μL. Thermocycling conditions were: initial denaturing (95° C for 10 min); 35 cycles of denaturing (95° C, 30 seconds), annealing (55° C, 30 seconds), and extension (72° C, 30 seconds); final elongation (72° C, 2 min). We confirmed successful amplification with gel electrophoresis, and PCR products were pooled by plate and purified using MinElute PCR Purification kits. Purified PCR products were then pooled into sequencing libraries, ensuring equimolarity.
Sequencing libraries were prepared with a PCR-free approach (2016) or low-cycle PCR (2018, 2019) by the Lewis Sigler Institute for Integrative Genomics at Princeton University, and paired-end sequencing (2×150bp) was performed on Illumina MiSeq and HiSeq 2500 platforms.
We processed sequence data for each collection period using the OBITools package [2]. These procedures were similar to those described above for wild dogs. Briefly, we assembled paired-end reads (illuminapairedend; arguments: --sanger --score-min=40), assigned reads to the samples they came from (ngsfilter; arguments: -e 2), and merged unique sequences (obiuniq). We discarded sequences with a low alignment score, ambiguous nucleotides, unexpected barcode length (< 8 or > 180 nt), and < 10 reads across the dataset. We then aligned the remaining sequences to identify sequence variants (obiclean; arguments: -d 1 -r 0.25) and removed sequences that were identified more often as variants of dominant sequences than as dominant sequences themselves or as singular sequences (without variants present). We assigned taxonomic identities to the remaining sequences, using as the primary reference a local plant DNA database based on vouchered specimens collected in Gorongosa [10]. When the local-library assignment score was < 98%, we used secondary comparisons to a global database compiled from the European Molecular Biology Laboratory (release 143) and local databases of plants from Serengeti National Park, Tanzania, and Laikipia, Kenya [11] for taxonomic identification. Sequence variants identified by obiclean were retained in the dataset if they perfectly matched a reference sequence in the local Gorongosa reference database. Finally, we removed putative contaminants (sequences most abundant in negative controls) and chimeras (sequences that had a < 80% match in the reference databases). Last, PCR replicates were retained if they were within the 95th quantile of inter-replicate dissimilarity, did not fall within the distribution of inter-sample dissimilarities, and were below the 95th quantile of putative contaminant and chimera abundance. Samples were retained if they had > 1 replicate remaining after this procedure. We then averaged the number of reads across all retained PCR replicates for each sample and removed molecular operational taxonomic units (mOTUs) accounting for < 1% of reads per sample, following the approach that we have used in previous studies from this system [10,12]. While there are multiple approaches to dealing with low-abundance reads in metabarcoding, per-sample proportional abundance thresholds are one of the simplest and most effective [4]. We tested the sensitivity of our results to removing this threshold and found that while this increased the number of unique sequences by two orders of magnitude (implausibly many relative to the estimated number of plant taxa in our study area), it did not qualitatively alter our results for the abundance-weighted dietary metrics that we analyze. Finally, we rarefied sample read depth to 1,250 reads to facilitate comparisons among samples and converted mOTU-by-sample matrices into proportional abundance (relative read abundance, RRA) of each plant mOTU per sample. We use RRA data for inference given that RRA of the trnL-P6 marker (i) provides a reasonable approximation of proportional consumption [13-15], (ii) has yielded inferences in our previous studies that qualitatively match those based on presence-absence data [10,12,15], and (iii) provides a fuller and more accurate characterization of population-level diets than presence-absence data [16,17].
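The final two steps, rarefying each sample to 1,250 reads and converting counts to RRA, can be sketched as follows. The sketch assumes rarefaction by random subsampling of reads without replacement, which is a common approach but is not specified above; the function names, the seed, and the read counts are hypothetical.

```python
import random
from collections import Counter


def rarefy(counts, depth=1250, seed=None):
    """Randomly subsample `depth` reads without replacement.

    counts maps mOTU id -> integer read count. Returns None when the sample
    has fewer than `depth` reads (such samples cannot be rarefied).
    """
    pool = [motu for motu, n in counts.items() for _ in range(n)]
    if len(pool) < depth:
        return None
    rng = random.Random(seed)
    return dict(Counter(rng.sample(pool, depth)))


def to_rra(counts):
    """Convert read counts to relative read abundance (proportions summing to 1)."""
    total = sum(counts.values())
    return {motu: n / total for motu, n in counts.items()}


# Hypothetical per-sample read counts.
sample_counts = {"motu_1": 3000, "motu_2": 1500, "motu_3": 500}
rarefied = rarefy(sample_counts, depth=1250, seed=1)
rra = to_rra(rarefied)

shallow = rarefy({"motu_1": 400, "motu_2": 300}, depth=1250, seed=1)
print(sum(rarefied.values()), shallow)
```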
Confirming species identities for a subset of herbivore fecal samples. For a subset of fecal samples, our field notes recorded uncertainties in the herbivore species’ identity. Typically this arose from unusual dung morphology or difficulty locating a dung pile after directly observing defecation. For this subset of samples, we amplified the 16S mitochondrial barcode (MamP007F, MamP007R primers) used to analyze wild dog diets to identify the herbivore species of the sample. As described above, the primers were tagged on the 5’ end with 8-nt identifiers that each differed by at least 4-nt. We used the same PCR mixture and thermocycling conditions reported above for wild dog diet analysis. After confirming amplification with gel electrophoresis, we pooled and purified the PCR products using the MinElute PCR Purification kit. Purified DNA was combined in equimolar proportions into two sequencing libraries and submitted to the Lewis Sigler Institute for Integrative Genomics, where Illumina tags were appended with a low-cycle PCR approach and libraries were sequenced in paired-end (2×150bp) on a NovaSeq SP 300 nt.
As for diet datasets, sequences were filtered using OBITools following the same procedure outlined above (paired-end reads assembled with illuminapairedend, sequences assigned to samples with ngsfilter, sequences dereplicated with obiuniq) [2]. As for the wild dog diet filtering, sequences outside the expected 40–140 bp range, with low alignment scores, or with ambiguous nucleotides were discarded. Sequence variants were identified with obiclean and only dominant or singular variants were retained. Taxonomic identifiers were associated with sequences using the global reference database for mammals that we compiled with in silico PCR from the European Molecular Biology Laboratory release 143. Following ref. 10, we used five criteria to assign identities to the remaining samples: (1) the sample had > 350 mammalian reads, (2) the most common sequence accounted for ≥50% of reads, (3) the most common sequence matched a reference database sequence with ≥98% identity, (4) only one Latin binomial name was associated with that most common sequence, (5) that most common sequence was at least twice as common as the second most common. Samples were discarded if they failed two of these criteria; for the remainder, the most common sequence was assigned as the sample identity.
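The five criteria can be expressed as a small decision function, sketched below in Python. The sketch interprets "failed two of these criteria" as failing two or more, which is an assumption; the record layout (keys `reads`, `best_identity`, `species`), the function name, and the demonstration values are hypothetical.

```python
def assign_identity(sequences, min_reads=350):
    """Return the most common sequence record, or None if the sample is discarded.

    Each record is a dict with 'reads' (int), 'best_identity' (0-1 match to the
    reference database), and 'species' (list of Latin binomials for that match).
    """
    ranked = sorted(sequences, key=lambda s: s["reads"], reverse=True)
    total = sum(s["reads"] for s in ranked)
    if not ranked or total == 0:
        return None
    top = ranked[0]
    second_reads = ranked[1]["reads"] if len(ranked) > 1 else 0
    checks = [
        total > min_reads,                  # (1) > 350 mammalian reads
        top["reads"] / total >= 0.5,        # (2) top sequence >= 50% of reads
        top["best_identity"] >= 0.98,       # (3) >= 98% match to a reference
        len(top["species"]) == 1,           # (4) a single Latin binomial
        top["reads"] >= 2 * second_reads,   # (5) at least twice the runner-up
    ]
    failures = checks.count(False)
    # Assumption: a sample is discarded when it fails two or more criteria.
    return None if failures >= 2 else top


# Hypothetical samples.
clear_sample = [
    {"reads": 900, "best_identity": 1.00, "species": ["Kobus ellipsiprymnus"]},
    {"reads": 100, "best_identity": 0.99, "species": ["Redunca arundinum"]},
]
poor_sample = [
    {"reads": 200, "best_identity": 0.95, "species": ["Redunca arundinum"]},
    {"reads": 150, "best_identity": 0.99, "species": ["Ourebia ourebi"]},
]
assigned = assign_identity(clear_sample)
discarded = assign_identity(poor_sample)
print(assigned["species"][0] if assigned else None, discarded)
```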
Additional files and scripts. The archive also includes data-analysis scripts to reproduce results for diet analyses and NDVI analyses (including shapefiles used in the NDVI-analysis script).
References
1. Giguet-Covex, C., et al. Long livestock farming history and human landscape shaping revealed by lake sediment DNA. Nature Comm. 5, 3211 (2014).
2. Boyer, F., et al. OBITOOLS: a UNIX-inspired software package for DNA metabarcoding. Mol. Ecol. Resour. 16, 176-182 (2016).
3. Pansu, J., et al. Generality of cryptic dietary niche differentiation in diverse large-herbivore assemblages. PNAS 119, e2204400119 (2022).
4. Drake, L. E., et al. An assessment of minimum sequence copy thresholds for identifying and reducing the prevalence of artefacts in dietary metabarcoding data. Methods in Ecol. & Evol. 13, 694-710 (2022).
5. Daskin, J. H., et al. Allometry of behavior and niche differentiation among congeneric African antelopes. Ecol. Monogr., e1549 (2022).
6. Atkins, J. L., et al. Cascading impacts of large-carnivore extirpation in an African ecosystem. Science 364, 173-177 (2019).
7. Branco, P. S., et al. Determinants of elephant foraging behavior in a coupled human-natural system: is brown the new green? J. An. Ecol. 88, 780-792 (2019).
8. Walker, R. H., et al. Mechanisms of individual variation in large herbivore diets: roles of spatial heterogeneity and state-dependent foraging. Ecology 104, e3921 (2023).
9. Taberlet, P., et al. Power and limitation of the chloroplast trnL (UAA) intron for plant DNA barcoding. Nucleic Acids Research 35, e14 (2007).
10. Pansu, J., et al. Trophic ecology of large herbivores in a reassembling African ecosystem. J. Ecol. 107, 1355-1376 (2019).
11. Gill, B. A., et al. Plant DNA-barcode library and community phylogeny for a semi-arid Eastern African savanna. Mol. Ecol. Resour. 19, 838-846 (2019).
12. Guyton, J. A., et al. Trophic rewilding revives biotic resistance to shrub invasion. Nat. Ecol. Evol. 4, 712-724 (2020).
13. Willerslev, E., et al. Fifty thousand years of Arctic vegetation and megafaunal diet. Nature 506, 47-51 (2014).
14. Craine, J. M., Towne, E. G., Miller, M., & Fierer, N. Climatic warming and the future of bison as grazers. Sci. Rep. 5, 16738 (2015).
15. Kartzinel, T. R., et al. DNA metabarcoding illuminates dietary niche partitioning by African large herbivores. PNAS 112, 8019-8024 (2015).
16. Deagle, B. E., et al. Counting with DNA in metabarcoding studies: How should we convert sequence reads to dietary data? Mol. Ecol. 28, 391-406 (2019).
17. Littleford-Colquhoun, B. L., et al. The precautionary principle and dietary DNA metabarcoding: commonly used abundance thresholds change ecological interpretation. Mol. Ecol. 31, 1615-1626 (2022).