Heterogeneity in the rate of molecular sequence evolution substantially impacts the accuracy of detecting shifts in diversification rates

Cite this dataset

Shafir, Anat; Azouri, Dana; Goldberg, Emma E.; Mayrose, Itay (2020). Heterogeneity in the rate of molecular sequence evolution substantially impacts the accuracy of detecting shifts in diversification rates [Dataset]. Dryad. https://doi.org/10.5061/dryad.37pvmcvgp

Abstract

As species richness varies along the tree of life, there is great interest in identifying factors that affect the rates at which lineages speciate or go extinct. To this end, theoretical biologists have developed a suite of phylogenetic comparative methods that aim to identify where shifts in diversification rates have occurred along a phylogeny and whether they are associated with some traits. Using these methods, numerous studies have predicted that speciation and extinction rates vary across the tree of life. In this study we show that asymmetric rates of sequence evolution lead to systematic biases in the inferred phylogeny, which in turn lead to erroneous inferences regarding lineage diversification patterns. The results demonstrate that as the asymmetry in sequence evolution rates increases, so does the tendency to select more complicated models that include the possibility of diversification rate shifts. These results thus suggest that any inference regarding shifts in diversification patterns should be treated with great caution, at least until any biases regarding the molecular substitution rate have been ruled out.

Methods

All data were created using simulations.

All the trees were simulated using the Diversitree package (https://www.zoology.ubc.ca/prog/diversitree/).

The multiple sequence alignments were simulated using INDELible (http://abacus.gene.ucl.ac.uk/software/indelible/).

The maximum likelihood trees were inferred using PhyML (http://www.atgc-montpellier.fr/phyml/).

The time-calibrated trees were obtained using MrBayes (http://nbisweden.github.io/MrBayes/), PATHd8 (https://www2.math.su.se/PATHd8/) and the penalized likelihood method implemented in r8s (https://sourceforge.net/projects/r8s/).

Trait-dependent diversification was tested using BiSSE (https://www.zoology.ubc.ca/prog/diversitree/) and HiSSE (https://CRAN.R-project.org/package=hisse).

Trait-independent shifts in diversification were inferred using BAMM (http://bamm-project.org/) and MEDUSA (https://github.com/josephwb/turboMEDUSA).

Usage notes

The compressed data are stored in the following files:

  • SimulationSet1_4.tar.gz
  • SimulationSet2.tar.gz
  • SimulationSet3.tar.gz
  • SimulationSet5a.tar.gz
  • SimulationSet5b.tar.gz
  • SimulationSetSeqLen_250.tar.gz
  • SimulationSetSeqLen_2000.tar.gz

The scripts can be found in the compressed folder scripts.zip. The attached Readme.txt and DataInfo.txt files give further explanation of the scripts and the data.
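
For readers who want to unpack the files listed above in one step, the following is a minimal Python sketch. The archive names are taken from the list above; everything else is an assumption made for illustration: the helper name unpack_all, the idea that all archives were downloaded into a single directory, and the choice to extract each archive into its own subdirectory. The internal layout of the archives is not described here, so consult Readme.txt and DataInfo.txt after extraction.

```python
import tarfile
import zipfile
from pathlib import Path

# Archive names exactly as listed in the usage notes.
DATA_ARCHIVES = [
    "SimulationSet1_4.tar.gz",
    "SimulationSet2.tar.gz",
    "SimulationSet3.tar.gz",
    "SimulationSet5a.tar.gz",
    "SimulationSet5b.tar.gz",
    "SimulationSetSeqLen_250.tar.gz",
    "SimulationSetSeqLen_2000.tar.gz",
]
SCRIPTS_ARCHIVE = "scripts.zip"


def unpack_all(download_dir, out_dir):
    """Extract every listed archive that exists in download_dir.

    Hypothetical helper: each archive is extracted into its own
    subdirectory of out_dir, named after the archive without its
    extension. Archives that are missing are skipped. Returns the
    names of the archives that were extracted, in listing order.
    """
    download_dir = Path(download_dir)
    out_dir = Path(out_dir)
    extracted = []

    for name in DATA_ARCHIVES:
        archive = download_dir / name
        if not archive.is_file():
            continue  # not downloaded; skip it
        target = out_dir / name[: -len(".tar.gz")]
        target.mkdir(parents=True, exist_ok=True)
        # Only extract archives obtained from a source you trust.
        with tarfile.open(archive, "r:gz") as tar:
            tar.extractall(target)
        extracted.append(name)

    scripts = download_dir / SCRIPTS_ARCHIVE
    if scripts.is_file():
        target = out_dir / "scripts"
        target.mkdir(parents=True, exist_ok=True)
        with zipfile.ZipFile(scripts) as zf:
            zf.extractall(target)
        extracted.append(SCRIPTS_ARCHIVE)

    return extracted


if __name__ == "__main__":
    # Assumed locations; adjust to wherever the files were saved.
    for archive_name in unpack_all("downloads", "extracted"):
        print("extracted", archive_name)
```

Extracting each archive into its own subdirectory keeps the simulation sets separate even if they contain files with identical names.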

Funding

United States-Israel Binational Science Foundation, Award: 2013286

National Science Foundation, Award: DEB 1655478

National Science Foundation, Award: DEB 1940868

National Science Foundation, Award: BSF 1655478