
Data from: Inferring heterogeneous evolutionary processes through time: from sequence substitution to phylogeography

Cite this dataset

Bielejec, Filip et al. (2014). Data from: Inferring heterogeneous evolutionary processes through time: from sequence substitution to phylogeography [Dataset]. Dryad. https://doi.org/10.5061/dryad.qp747

Abstract

Molecular phylogenetic and phylogeographic reconstructions generally assume time-homogeneous substitution processes. Motivated by computational convenience, this assumption sacrifices biological realism and offers little opportunity to uncover the temporal dynamics in evolutionary histories. Here, we propose an evolutionary approach that explicitly relaxes the time-homogeneity assumption by allowing the specification of different infinitesimal substitution rate matrices across different time intervals, called epochs, along the evolutionary history. We focus on an epoch model implementation in a Bayesian inference framework that offers great modeling flexibility in drawing inference about any discrete data type characterized as a continuous-time Markov chain, including phylogeographic traits. To alleviate the computational burden that the additional temporal heterogeneity imposes, we adopt a massively parallel approach that achieves both fine- and coarse-grain parallelization of the computations across branches that accommodate epoch transitions, making extensive use of graphics processing units. Through synthetic examples, we assess model performance in recovering evolutionary parameters from data generated according to different evolutionary scenarios that comprise different numbers of epochs for both nucleotide and codon substitution processes. We illustrate the usefulness of our inference framework in two different applications to empirical data sets: the selection dynamics on within-host HIV populations throughout infection and the seasonality of global influenza circulation. In both cases, our epoch model captures key features of temporal heterogeneity that remained difficult to test using ad hoc procedures.
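The core idea of the epoch model described above can be illustrated with a small sketch: when a branch spans several epochs, its finite-time transition probability matrix is the ordered product of the per-epoch matrix exponentials, each evaluated over the time the branch spends in that epoch. The code below is a minimal pure-Python illustration under stated assumptions, not the authors' implementation. The function names, the forward-in-time convention, and the example rate matrices are all hypothetical, and the matrix exponential uses a simple truncated Taylor series with scaling and squaring rather than the GPU-based computation mentioned in the abstract.

```python
import math

# Illustrative sketch only: names and conventions are assumptions,
# not taken from the dataset or from any phylogenetics software.

def mat_mul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def mat_exp(q, t, terms=20, squarings=10):
    """Approximate exp(q * t) by scaling and squaring a truncated Taylor series."""
    n = len(q)
    scale = t / (2 ** squarings)
    a = [[q[i][j] * scale for j in range(n)] for i in range(n)]
    result = identity(n)
    term = identity(n)
    for k in range(1, terms + 1):
        term = mat_mul(term, a)
        term = [[x / k for x in row] for row in term]
        result = [[result[i][j] + term[i][j] for j in range(n)]
                  for i in range(n)]
    for _ in range(squarings):
        result = mat_mul(result, result)
    return result

def epoch_transition_matrix(rate_matrices, boundaries, start, end):
    """Transition matrix for a branch running forward from `start` to `end`.

    `boundaries` are the sorted epoch transition times; there must be
    len(boundaries) + 1 rate matrices, one per epoch.
    """
    edges = [-math.inf] + list(boundaries) + [math.inf]
    p = identity(len(rate_matrices[0]))
    for q, lo, hi in zip(rate_matrices, edges[:-1], edges[1:]):
        overlap = min(end, hi) - max(start, lo)  # time spent in this epoch
        if overlap > 0:
            p = mat_mul(p, mat_exp(q, overlap))
    return p

# Hypothetical two-state example with one epoch transition at time 1.0.
q1 = [[-1.0, 1.0], [1.0, -1.0]]
q2 = [[-0.2, 0.2], [0.6, -0.6]]
p = epoch_transition_matrix([q1, q2], [1.0], 0.5, 2.0)
```

Here the branch spends 0.5 time units in the first epoch and 1.0 in the second, so `p` equals `exp(q1 * 0.5)` multiplied by `exp(q2 * 1.0)`. A branch that lies entirely within one epoch reduces to the usual time-homogeneous case.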

Usage notes