Dryad

Phylogenies related to: Codiversification of gut microbiota with humans

Cite this dataset

Suzuki, Taichi; Fitzstevens, Liam; Youngblut, Nicholas; Ley, Ruth (2022). Phylogenies related to: Codiversification of gut microbiota with humans [Dataset]. Dryad. https://doi.org/10.5061/dryad.qrfj6q5k2

Abstract

The gut microbiomes of human populations worldwide have many core microbial species in common. However, within a species, some strains can show remarkable population specificity. The question is whether such specificity arises from a shared evolutionary history (codiversification) between humans and their microbes. To test for codiversification of host and microbiota, we analyzed paired gut metagenomes and human genomes for 1225 individuals in Europe, Asia, and Africa, including mothers and their children. Between and within countries, signals of codiversification were evident for humans and their gut microbes. Moreover, species displaying the strongest codiversification independently evolved traits characteristic of host dependency, including reduced genomes, and oxygen and temperature sensitivity. These findings all point to the importance of understanding the potential role of population-specific microbial strains in microbiome-mediated disease phenotypes.

Methods

Microbial phylogenies: We created microbial phylogenies from adult and child metagenomes using StrainPhlAn v3.0. First, we picked the 100 most prevalent taxa in the metagenomic samples of adults (n=839) and children (n=386) using MetaPhlAn3 (v3.0.1) with the following parameters (--tax_lev a --min_cu_len 2000). The taxonomy is based on NCBI. Next, we used StrainPhlAn3 to generate microbial phylogenies with the following parameters: samples2markers.py (--breadth_threshold 80) and strainphlan.py (--phylophlan_mode accurate --marker_in_n_samples 10 --sample_with_n_markers 10). To allow for between-population comparisons, we selected taxa in adults whose trees represent a total of ≥ 100 individuals, with ≥ 10 individuals from each major human grouping (Africa: Cameroon and Gabon; Asia: Korea and Vietnam; Europe: Germany and UK) and ≥ 1 individual per country, which resulted in 59 taxa. For child samples, because of the smaller sample size, we instead picked the 20 taxa with the largest trees. See paper for more details.
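The adult taxon selection rule described above can be sketched as a small check. This is only an illustration under stated assumptions, not the authors' script: the function name passes_selection and the input format (a dict of per-country adult counts for one taxon's tree) are hypothetical, and "≥ 1 individual per country" is read here as applying to each of the six listed countries.

```python
# Region membership as listed in the Methods text.
REGIONS = {
    "Africa": ("Cameroon", "Gabon"),
    "Asia": ("Korea", "Vietnam"),
    "Europe": ("Germany", "UK"),
}


def passes_selection(counts):
    """Return True if a taxon meets the adult selection thresholds.

    counts: dict mapping country name -> number of adults represented
    in that taxon's tree (hypothetical input format).
    """
    # Threshold 1: at least 100 individuals in total.
    total = sum(counts.get(c, 0) for cs in REGIONS.values() for c in cs)
    if total < 100:
        return False
    for countries in REGIONS.values():
        # Threshold 2: at least 10 individuals per major grouping.
        if sum(counts.get(c, 0) for c in countries) < 10:
            return False
        # Threshold 3: at least 1 individual per country (assumed reading).
        if any(counts.get(c, 0) < 1 for c in countries):
            return False
    return True
```

Applying such a check to every candidate taxon would leave only the taxa that allow between-population comparisons; in the dataset this step yielded 59 taxa.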

Human host phylogeny: We created 100 maximum likelihood trees in SNPhylo (version 20180901) for 839 individuals from 20,506 SNPs, which were further filtered to remove uninformative SNPs based on the following parameters (ld_threshold = 0.1, maf_threshold = 0.05, and missing_rate = 0.1). This left around 9,000 SNPs for building the host phylogeny (each bootstrapped tree uses a slightly different number of SNPs). We picked the best of the 100 trees based on the maximum likelihood score and plotted the bootstrap values.
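To make the minor allele frequency and missing-rate thresholds concrete, here is a minimal sketch of a per-SNP filter. It is not SNPhylo's implementation: the function keep_snp and the genotype encoding (alternate-allele counts 0, 1, 2, with None for a missing call) are assumptions made for illustration, the handling of values exactly at a threshold is a guess, and the linkage disequilibrium pruning step (ld_threshold) is omitted.

```python
def keep_snp(genotypes, maf_threshold=0.05, missing_rate=0.1):
    """Decide whether one SNP passes the MAF and missing-rate filters.

    genotypes: list with one entry per individual, holding the number of
    alternate alleles (0, 1 or 2) or None when the call is missing.
    """
    n = len(genotypes)
    called = [g for g in genotypes if g is not None]
    if n == 0 or not called:
        return False
    # Drop SNPs with too many missing calls.
    if (n - len(called)) / n > missing_rate:
        return False
    # Minor allele frequency over called (diploid) genotypes.
    alt_freq = sum(called) / (2 * len(called))
    maf = min(alt_freq, 1 - alt_freq)
    return maf >= maf_threshold
```

A filter of this kind, together with LD pruning, is how a starting set of 20,506 SNPs can shrink to roughly 9,000 informative ones.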

Association file: A table associating the IDs used in each file.

Description of files:

- Maximum likelihood phylogenies of 59 microbial taxa in adults with bootstrap support: .tre file names start with "Adult_RAxML_bipartitions." followed by the taxon name.

- Alignment files for 20 microbial taxa in adults: .aln file names start with "Adult_alignment_" followed by the taxon name.

- Maximum likelihood phylogenies of 20 microbial taxa in children with bootstrap support: .tre file names start with "Child_RAxML_bipartitions." followed by the taxon name.

- Alignment files for 20 microbial taxa in children: .aln file names start with "Child_alignment_" followed by the taxon name.

- Maximum likelihood phylogenies of human hosts with bootstrap support: HostTree.tre

- Alignment file for the human host phylogeny: HostTree_alignment.fasta

- Association file: ID_table_n839.txt
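The naming scheme in the list above lends itself to a small helper for sorting downloaded files by cohort and type. The following is a sketch only: the function describe is hypothetical, and the taxon name in the usage note (s__Example_taxon) is a placeholder, not an actual file in the dataset.

```python
# Prefixes taken from the file descriptions; values are (cohort, kind).
PREFIXES = {
    "Adult_RAxML_bipartitions.": ("adult", "tree"),
    "Child_RAxML_bipartitions.": ("child", "tree"),
    "Adult_alignment_": ("adult", "alignment"),
    "Child_alignment_": ("child", "alignment"),
}
EXTENSIONS = (".tre", ".aln")


def describe(filename):
    """Return (cohort, kind, taxon) for a microbial file, else None."""
    for prefix, (cohort, kind) in PREFIXES.items():
        if filename.startswith(prefix):
            taxon = filename[len(prefix):]
            # Strip a trailing .tre or .aln extension if present.
            for ext in EXTENSIONS:
                if taxon.endswith(ext):
                    taxon = taxon[: -len(ext)]
                    break
            return cohort, kind, taxon
    # Host files (HostTree.tre, etc.) and the ID table do not match.
    return None
```

For example, describe("Adult_RAxML_bipartitions.s__Example_taxon.tre") would return ("adult", "tree", "s__Example_taxon"), while describe("HostTree.tre") returns None.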

Usage notes

Any tree-viewing software (iTOL, FigTree, etc.) can open and display the tree files (.tre).

Any text-viewing software (BBEdit, etc.) can open and display the alignment files (.aln and .fasta).
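For programmatic access, the alignments can also be read with a few lines of code. The sketch below assumes the files are FASTA-formatted, which the .fasta extension suggests for the host alignment but which is only an assumption for the .aln files; read_fasta is a hypothetical helper that uses only the Python standard library.

```python
def read_fasta(lines):
    """Parse FASTA text into a dict of sequence ID -> sequence string.

    lines: any iterable of text lines, such as an open file handle.
    """
    records = {}
    name = None
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Use the first whitespace-delimited token as the ID.
            parts = line[1:].split()
            name = parts[0] if parts else ""
            records[name] = []
        elif name is not None:
            records[name].append(line)
    # Join wrapped sequence lines into one string per record.
    return {key: "".join(chunks) for key, chunks in records.items()}
```

Passing an open handle for HostTree_alignment.fasta to read_fasta would give one entry per sequence, whose IDs could then be compared against ID_table_n839.txt.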

Funding

Max Planck Society