A Likelihood-Ratio Test for Lumpability of Phylogenetic Data: Is the Markovian Property of an Evolutionary Process Retained in Recoded DNA?

Citation

Jermiin, Lars et al. (2022), A Likelihood-Ratio Test for Lumpability of Phylogenetic Data: Is the Markovian Property of an Evolutionary Process Retained in Recoded DNA?, Dryad, Dataset, https://doi.org/10.5061/dryad.mw6m905w9

Abstract

In molecular phylogenetics, it is typically assumed that the evolutionary process for DNA can be approximated by independent and identically distributed Markovian processes at the variable sites and that these processes diverge over the edges of a rooted bifurcating tree. Sometimes the nucleotides are transformed from a 4-state alphabet to a 3- or 2-state alphabet by a procedure that is called recoding, lumping, or grouping of states. Here, we introduce a likelihood-ratio test for lumpability for DNA that has diverged under different Markovian conditions, which assesses the assumption that the Markovian property of the evolutionary process over each edge is retained after recoding of the nucleotides. The test is derived and validated numerically on simulated data. To demonstrate the insights that can be gained by using the test, we assessed two published data sets, one of mitochondrial DNA from a phylogenetic study of the ratites and the other of nuclear DNA from a phylogenetic study of yeast. Our analysis of these data sets revealed that recoding of the DNA eliminated some of the compositional heterogeneity detected over the sequences. However, the Markovian property of the original evolutionary process was not retained by the recoding, leading to some significant distortions of edge lengths in reconstructed trees. [Evolutionary processes; likelihood-ratio test; lumpability; Markovian processes; Markov models; phylogeny; recoding of nucleotides.]
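To illustrate what lumpability means for a recoded alphabet, the sketch below checks the classical strong-lumpability condition on a rate matrix: for every pair of distinct groups, each state in the first group must have the same total rate of change into the second group. This is not the likelihood-ratio test introduced in the paper, only the underlying algebraic condition. The function names, the state order (A, C, G, T), the purine/pyrimidine grouping, and the example rates are all assumptions made for illustration.

```python
def rate_matrix(off_diagonal):
    """Build a rate matrix whose rows sum to zero from off-diagonal rates."""
    n = len(off_diagonal)
    return [
        [-sum(off_diagonal[i]) if i == j else off_diagonal[i][j] for j in range(n)]
        for i in range(n)
    ]


def is_lumpable(Q, partition, tol=1e-12):
    """Return True if Q meets the strong-lumpability condition for `partition`.

    `partition` is a list of tuples of state indices. For every ordered pair
    of distinct groups (a, b), the summed rate from a state in a into b must
    be the same for all states in a.
    """
    for a_idx, group_a in enumerate(partition):
        for b_idx, group_b in enumerate(partition):
            if a_idx == b_idx:
                continue
            totals = [sum(Q[i][j] for j in group_b) for i in group_a]
            if max(totals) - min(totals) > tol:
                return False
    return True


# Assumed state order: A=0, C=1, G=2, T=3.
RY = [(0, 2), (1, 3)]  # purines {A, G} and pyrimidines {C, T}

# Illustrative two-rate matrix: transitions (A<->G, C<->T) at rate 2.0,
# transversions at rate 1.0.
two_rate = rate_matrix([
    [0.0, 1.0, 2.0, 1.0],
    [1.0, 0.0, 1.0, 2.0],
    [2.0, 1.0, 0.0, 1.0],
    [1.0, 2.0, 1.0, 0.0],
])

# Same matrix, but with the A->C rate raised to 3.0, so A and G no longer
# leave the purine group at the same total rate.
uneven = rate_matrix([
    [0.0, 3.0, 2.0, 1.0],
    [1.0, 0.0, 1.0, 2.0],
    [2.0, 1.0, 0.0, 1.0],
    [1.0, 2.0, 1.0, 0.0],
])

print(is_lumpable(two_rate, RY))
print(is_lumpable(uneven, RY))
```

In the first matrix both purines move to the pyrimidine group at the same total rate, so the condition holds; in the second, A and G differ, so the 2-state process obtained by recoding would not be Markovian.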

Methods

The data were generated as described in:

Phillips et al. (2010). Tinamous and moa flock together: mitochondrial genome sequence analysis reveals independent losses of flight among ratites. Syst. Biol. 59:90-107.

or obtained from:

Phillips et al. (2004). Genome-scale phylogeny and the detection of systematic biases. Mol. Biol. Evol. 21:1455-1458.

The data were subsequently used in:

Vera-Ruiz et al. (2021). A likelihood-ratio test for lumpability of phylogenetic data: Is the Markovian property of an evolutionary process retained in recoded DNA? Syst. Biol. (doi: 10.1093/sysbio/syab074).

Usage Notes

The data files are in Nexus format and are identical to those obtained from Matthew J. Phillips, first author of:

Phillips et al. (2010). Tinamous and moa flock together: mitochondrial genome sequence analysis reveals independent losses of flight among ratites. Syst. Biol. 59:90-107.

Phillips et al. (2004). Genome-scale phylogeny and the detection of systematic biases. Mol. Biol. Evol. 21:1455-1458.
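As a minimal sketch of how sequences read from such files could be recoded to a 2-state alphabet, the example below applies purine/pyrimidine (RY) recoding to a plain sequence string. RY is one common scheme and is assumed here for illustration; it is not necessarily the recoding applied to these data. The function name is hypothetical, and leaving gaps and ambiguity codes unchanged is also an assumption.

```python
# Assumed RY scheme: purines (A, G) -> R, pyrimidines (C, T) -> Y.
RY_MAP = {"A": "R", "G": "R", "C": "Y", "T": "Y"}


def recode_ry(sequence):
    """Recode a DNA string to the 2-state RY alphabet.

    Characters outside A, C, G, T (gaps, ambiguity codes) are passed through
    unchanged; this is an illustrative choice, not a rule from the data set.
    """
    return "".join(RY_MAP.get(base, base) for base in sequence.upper())


print(recode_ry("ACGT-N"))
```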

Funding