Determinants of microbiome composition: Insights from free-ranging hybrid zebras (Equus quagga × grevyi)
Data files
Feb 09, 2024 version files 9.15 GB
- Bacterial_phylogeny.nwk (99.72 KB)
- Bacterial_taxonomy.csv (964.36 KB)
- Diet_composition_data.txt (101.29 KB)
- Hybrid_zebra_diet_forward_reads.fastq (1.79 GB)
- Hybrid_zebra_diet_reverse_reads.fastq (1.79 GB)
- Hybrid_zebra_microbiome_forward_reads.fastq (2.79 GB)
- Hybrid_zebra_microbiome_reverse_reads.fastq (2.79 GB)
- Microbiome_composition_data.csv (590.57 KB)
- README.md (9.79 KB)
- Sample_metadata.txt (6.99 KB)
Abstract
The composition of mammalian gut microbiomes is highly conserved within species, yet the mechanisms by which microbiome composition is transmitted and maintained within lineages of wild animals remain unclear. Mutually compatible hypotheses exist, including that microbiome fidelity results from inherited dietary habits, shared environmental exposure, morphophysiological filtering, and/or maternal effects. Interspecific hybrids are a promising system in which to interrogate the determinants of microbiome composition because hybrids can decouple traits and processes that are otherwise co-inherited in their parent species. We used a population of free-living hybrid zebras (Equus quagga × grevyi) in Kenya to evaluate the roles of these four mechanisms in regulating microbiome composition. We analyzed fecal DNA for both the trnL-P6 and the 16S rRNA V4 region to characterize the diets and microbiomes of the hybrid zebra and of their parent species, plains zebra (E. quagga) and Grevy’s zebra (E. grevyi). We found that both diet and microbiome composition clustered by species, and that hybrid diets and microbiomes were largely nested within those of the maternal species, plains zebra. Hybrid microbiomes were less variable than those of either parent species where they co-occurred. Diet and microbiome composition were strongly correlated, although the strength of this correlation varied between species. These patterns are most consistent with the maternal-effects hypothesis, somewhat consistent with the diet hypothesis, and largely inconsistent with the environmental-sourcing and morphophysiological-filtering hypotheses. Maternal transmittance likely operates in conjunction with inherited feeding habits to conserve microbiome composition within species.
README
This 'README_file_Hybrid_zebra_microbiomes.txt' file was generated on 2024-01-22 by JOEL O. ABRAHAM
GENERAL INFORMATION
Title of Dataset: Determinants of microbiome composition: insights from free-ranging hybrid zebras (Equus quagga × grevyi)
Author Information
A. Principal Investigator Contact Information
Name: Joel O. Abraham
Institution: Princeton University
Email: joeloa@princeton.edu
B. Associate or Co-investigator Contact Information
Name: Daniel I. Rubenstein
Institution: Princeton University
Email: dir@princeton.edu
Date of data collection (approximate): 2020-01-08 to 2020-01-17
Geographic location of data collection: Laikipia County, Kenya
Information about funding sources that supported the collection of the data:
NSF DEB-2225088 and the High Meadows Environmental Institute at Princeton University.
Joel O. Abraham is supported by the NSF Graduate Research Fellowship (Fellow ID: 2019256075).
SHARING/ACCESS INFORMATION
Licenses/restrictions placed on the data: None.
Links to publications that cite or use the data: TBD
Links to other publicly accessible locations of the data: TBD
Links/relationships to ancillary data sets: NA
Was data derived from another source? NO
Recommended citation for this dataset:
Abraham, J.O., Lin, B., Miller, A.E., Henry, L.P., Demmel, M.Y., Warungu, R., Mwangi, M., Lobura, P.M., Pallares, L.F., Ayroles, J.F., Pringle, R.M., Rubenstein, D.I. (2024). Determinants of microbiome composition: insights from free-ranging hybrid zebras (Equus quagga × grevyi). Dryad, Dataset,
DATA & FILE OVERVIEW
File List:
- README_file_Hybrid_zebra_microbiomes.txt: README file explaining how the dataset was generated and the data contained in the dataset
- Sample_metadata.txt: File containing the sample metadata
- Hybrid_zebra_diet_forward_reads.fastq: Raw sequence data (forward reads for diet data)
- Hybrid_zebra_diet_reverse_reads.fastq: Raw sequence data (reverse reads for diet data)
- Hybrid_zebra_microbiome_forward_reads.fastq: Raw sequence data (forward reads for microbiome data)
- Hybrid_zebra_microbiome_reverse_reads.fastq: Raw sequence data (reverse reads for microbiome data)
- Diet_composition_data.txt: Relative read abundances of plant mOTUs in zebra fecal samples (post-filtering)
- Microbiome_composition_data.csv: Rarefied abundances of bacterial ASVs in zebra fecal samples (post-filtering)
- Bacterial_taxonomy.csv: Taxonomic information for all bacterial ASVs in zebra microbiomes
- Bacterial_phylogeny.nwk: Phylogenetic tree of bacterial ASVs in zebra microbiomes
- Hybrid zebra microbiome analyses.R: R script for analyzing data and generating visualizations
METHODOLOGICAL INFORMATION
Description of methods used for collection/generation of data:
Fecal samples from free-ranging zebra were collected from Laikipia County in January 2020. DNA was extracted from fecal samples and the plant and bacterial components were sequenced to characterize diet and microbiome composition respectively.
Methods for processing the data:
Diet sequence data were curated using the OBITOOLS v2 package, while microbiome sequence data were processed using the DADA2 v1.18 big data pipeline, implemented in R v4.0.2.
Instrument- or software-specific information needed to interpret the data:
R is necessary to run the R script file. The R script was written in R v4.0.2.
Standards and calibration information, if appropriate:
NA
Environmental/experimental conditions:
NA
Describe any quality-assurance procedures performed on the data:
NA
People involved with sample collection, processing, analysis and/or submission:
Joel O. Abraham, Bing Lin, Audrey E. Miller, Lucas P. Henry, Margaret Y. Demmel, Rosemary Warungu, Margaret Mwangi, Patrick M. Lobura, Luisa F. Pallares, Julien F. Ayroles, Robert M. Pringle, Daniel I. Rubenstein
AEM and BL conceived of the project and designed the sampling approach, with input from DIR and JFA.
AEM and BL led fecal sampling in the field, with help from RW, MM, and PML to find and identify hybrids.
JOA, AEM, and BL performed lab work, with technical advice from LFP, LPH, and MYD.
Lab work was conducted at the Mpala genomics facility in Kenya and at RMP’s laboratory at Princeton University.
JOA led bioinformatics and data analyses, with input from LPH and MYD.
DATA-SPECIFIC INFORMATION FOR: Sample_metadata.txt
Number of variables: 12
Number of cases/rows: 91
Variable List:
sample_number: A unique number assigned to each sample for indexing purposes
sample_ID: A unique ID assigned to each sample; corresponds to column headers in 'Diet_composition_data.txt' and 'Microbiome_composition_data.csv'
location: Reserve on which a given sample was collected; 'OLP' corresponds to Ol Pejeta (main area of the park); 'OLP-R' corresponds to Ol Pejeta (fenced-off reserve); 'MPA' corresponds to Mpala; 'OLJ' corresponds to Ol Jogi
latitude: Latitude at which the sample was collected
longitude: Longitude at which the sample was collected
status: Whether the sample was collected from the same reserve where hybrid zebra occur ('SYM' for sympatric) or a different reserve ('ALL' for allopatric)
species: The zebra species from which the sample was collected ('PL' is plains zebra, 'GR' is Grevy's zebra, and 'HY' is hybrid zebra)
pop: The subpopulation from which the sample was collected ('sym_hyb' is Ol Pejeta hybrids, 'sym_dad' is Ol Pejeta Grevy's, 'sym_mom' is Ol Pejeta plains, 'all_dad' is Mpala/Ol Jogi Grevy's, 'all_mom' is Mpala/Ol Jogi plains)
sex: The sex of the zebra from which the sample was collected ('M' is male, 'F' is female)
age: The age category of the zebra from which the sample was collected ('AD' is adult, 'JU' is juvenile)
group_size: Size of the group that the zebra from which the sample was collected was observed in
date: The time each sample was collected
Missing data codes:
'NA'
Specialized formats or other abbreviations used:
NA
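The variable list above can be used to tally samples per subpopulation or sex. The following is a minimal Python sketch, not part of the dataset's R script. It assumes that 'Sample_metadata.txt' is tab-delimited with a header row whose names match the variable list; the README does not state the delimiter, so check the file before relying on it. The helper names `read_metadata` and `count_by` and the three demo rows are hypothetical and contain no real data.

```python
import csv
import io
from collections import Counter


def read_metadata(handle, delimiter="\t"):
    """Read sample metadata into a list of dicts, mapping the 'NA' code to None.

    Assumption: the file is delimited text with a header row.
    """
    reader = csv.DictReader(handle, delimiter=delimiter)
    return [
        {key: (None if value == "NA" else value) for key, value in row.items()}
        for row in reader
    ]


def count_by(rows, column):
    """Count samples per value of one column, skipping missing values."""
    return Counter(row[column] for row in rows if row[column] is not None)


# Hypothetical excerpt with invented values, used only to exercise the functions.
demo = io.StringIO(
    "sample_ID\tlocation\tstatus\tspecies\tpop\tsex\n"
    "Z001\tOLP\tSYM\tHY\tsym_hyb\tF\n"
    "Z002\tOLP\tSYM\tPL\tsym_mom\tNA\n"
    "Z003\tMPA\tALL\tGR\tall_dad\tM\n"
)
rows = read_metadata(demo)
pop_counts = count_by(rows, "pop")
sex_counts = count_by(rows, "sex")
```

To run this on the real file, replace `demo` with an open file handle, for example `open("Sample_metadata.txt", newline="")`.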
DATA-SPECIFIC INFORMATION FOR: Diet_composition_data.txt
Number of variables: 19
Number of cases/rows: 144
Variable List:
id: Unique ID assigned to each plant mOTU
best_identity_MRC: Percent sequence match with best match in local reference library
best_identity_GDB: Percent sequence match with best match in global reference library
best_match_MRC: Barcode reference ID in the local reference library that best matches the mOTU sequence
best_match_GDB: Barcode reference ID in the global reference library that best matches the mOTU sequence
kingdom_name_ok: The kingdom assigned to the plant mOTU by comparing to the two reference libraries (always Viridiplantae)
phylum_name_ok: The phylum assigned to the plant mOTU by comparing to the two reference libraries (always Streptophyta)
class_name_ok: The class assigned to the plant mOTU by comparing to the two reference libraries
order_name_ok: The order assigned to the plant mOTU by comparing to the two reference libraries (if available)
family_name_ok: The family assigned to the plant mOTU by comparing to the two reference libraries (if available)
genus_name_ok: The genus assigned to the plant mOTU by comparing to the two reference libraries (if available)
species_name_ok: The species name assigned to the plant mOTU by comparing to the two reference libraries (if available)
scientific_name rank: The finest taxonomic rank available for each plant mOTU
species_list_MRC: The list of species matched to each plant mOTU from the local reference library
species_list_GDB: The list of species matched to each plant mOTU from the global reference library
sci_name_ok: The finest resolution taxonomic categorization of each plant mOTU
sci_name_plot: The taxonomic categorization of each plant mOTU used for visualization purposes
sequence: The DNA sequence corresponding to each plant mOTU
Occurence: The number of samples each plant mOTU occurred in (note that samples were sequenced in triplicate, so occurrence values are inflated by a factor of three)
Note: columns 20-105 are the relative read abundances of each plant mOTU in each sample
Missing data codes:
'NA'
Specialized formats or other abbreviations used:
NA
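Because the diet table mixes annotation columns (1-19) with per-sample abundance columns (20-105), it helps to separate the two before analysis. The following is a minimal Python sketch under stated assumptions: the file is tab-delimited with a header row, and dividing 'Occurence' by three recovers the number of distinct samples, which is one reading of the triplicate note above. The function name `split_diet_table` and the two demo rows are hypothetical.

```python
import csv
import io

N_ANNOTATION_COLUMNS = 19  # per the variable list; sample columns start at column 20


def split_diet_table(handle, n_annotation=N_ANNOTATION_COLUMNS, delimiter="\t"):
    """Split the diet table into mOTU annotations and per-sample abundances.

    Returns (annotations, abundances): a list of dicts holding the annotation
    columns, and a dict mapping mOTU id -> {sample_ID: abundance or None}.
    Assumption: the first column is the mOTU 'id'.
    """
    reader = csv.reader(handle, delimiter=delimiter)
    header = next(reader)
    annotation_names = header[:n_annotation]
    sample_ids = header[n_annotation:]
    annotations, abundances = [], {}
    for fields in reader:
        annotation = dict(zip(annotation_names, fields[:n_annotation]))
        annotations.append(annotation)
        abundances[annotation["id"]] = {
            sample: (None if value == "NA" else float(value))
            for sample, value in zip(sample_ids, fields[n_annotation:])
        }
    return annotations, abundances


# Hypothetical two-column annotation excerpt with invented values.
demo = io.StringIO(
    "id\tOccurence\tS1\tS2\n"
    "motu_1\t6\t0.75\t0.25\n"
    "motu_2\t3\t0.25\tNA\n"
)
annotations, abundances = split_diet_table(demo, n_annotation=2)

# Undo the threefold inflation described in the 'Occurence' note (assumed reading).
samples_with_motu = {a["id"]: int(a["Occurence"]) / 3 for a in annotations}
```

For the real file, the default `n_annotation=19` follows the variable list; the demo passes `n_annotation=2` only because its excerpt has two annotation columns.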
DATA-SPECIFIC INFORMATION FOR: Microbiome_composition_data.csv
Number of variables: 2892
Number of cases/rows: 84
Variable List:
Each cell contains the read count (averaged across all three replicates) of a particular bacterial ASV in a given sample (post-rarefaction); each row is a sample, each column is a bacterial ASV.
The taxonomic information for each bacterial ASV can be found in the file 'Bacterial_taxonomy.csv'
Missing data codes:
'NA'
Specialized formats or other abbreviations used:
NA
DATA-SPECIFIC INFORMATION FOR: Bacterial_taxonomy.csv
Number of variables: 19
Number of cases/rows: 2892
Variable List:
Kingdom: The kingdom assigned to the bacterial ASV (always Bacteria)
Phylum: The phylum assigned to the bacterial ASV (if available)
Class: The class assigned to the bacterial ASV (if available)
Order: The order assigned to the bacterial ASV (if available)
Family: The family assigned to the bacterial ASV (if available)
Genus: The genus assigned to the bacterial ASV (if available)
Species: The species name assigned to the bacterial ASV (if available)
sequence: The DNA sequence corresponding to each bacterial ASV
Missing data codes:
'NA'
Specialized formats or other abbreviations used:
NA
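The ASV counts in 'Microbiome_composition_data.csv' can be combined with 'Bacterial_taxonomy.csv' to summarize a sample at a coarser rank. The following is a minimal Python sketch of that aggregation at the phylum level. The README does not say how ASVs are keyed between the two files, so the sketch works on in-memory dictionaries and assumes a shared ASV identifier. The identifiers (`asv_a`, ...), phylum labels, counts, and function names are all hypothetical.

```python
from collections import defaultdict


def reads_per_phylum(sample_counts, taxonomy):
    """Sum one sample's ASV read counts by phylum.

    sample_counts: dict mapping ASV identifier -> read count for one sample.
    taxonomy: dict mapping ASV identifier -> dict with a 'Phylum' entry.
    ASVs with a missing or 'NA' phylum are pooled under 'Unassigned'.
    """
    totals = defaultdict(float)
    for asv, count in sample_counts.items():
        phylum = taxonomy.get(asv, {}).get("Phylum")
        if phylum in (None, "", "NA"):
            phylum = "Unassigned"
        totals[phylum] += count
    return dict(totals)


def to_proportions(totals):
    """Convert summed counts to proportions of the sample total."""
    grand_total = sum(totals.values())
    if grand_total == 0:
        return {name: 0.0 for name in totals}
    return {name: value / grand_total for name, value in totals.items()}


# Hypothetical identifiers, labels, and counts, used only for illustration.
taxonomy = {
    "asv_a": {"Phylum": "Firmicutes"},
    "asv_b": {"Phylum": "Bacteroidota"},
    "asv_c": {"Phylum": "NA"},
}
sample_counts = {"asv_a": 30.0, "asv_b": 15.0, "asv_c": 5.0, "asv_d": 0.0}

totals = reads_per_phylum(sample_counts, taxonomy)
shares = to_proportions(totals)
```

Because the counts are post-rarefaction, every sample should share the same total, so the proportions are directly comparable across samples.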
Methods
Fecal samples from free-ranging zebra were collected from Laikipia County in January 2020. DNA was extracted from fecal samples and the plant and bacterial components were sequenced to characterize diet and microbiome composition respectively. Diet sequence data were curated using the OBITOOLS v2 package, while microbiome sequence data were processed using the DADA2 v1.18 big data pipeline, implemented in R v4.0.2.