Dryad

Data from: A Poissonian model of indel rate variation for phylogenetic tree inference

Cite this dataset

Zhai, Yongliang; Bouchard-Côté, Alexandre (2017). Data from: A Poissonian model of indel rate variation for phylogenetic tree inference [Dataset]. Dryad. https://doi.org/10.5061/dryad.95h17

Abstract

While indel rate variation has been observed and analyzed in detail, it is not taken into account by current indel-aware phylogenetic reconstruction methods. In this work, we introduce a continuous-time stochastic process, the geometric Poisson indel process, that generalizes the Poisson indel process by allowing insertion and deletion rates to vary across sites. We design an efficient algorithm for computing the probability of a given multiple sequence alignment based on our new indel model. We describe a method to construct phylogeny estimates from a fixed alignment using neighbor joining. Using simulation studies, we show that ignoring indel rate variation may have a detrimental effect on the accuracy of the inferred phylogenies, and that our proposed method can sidestep this issue by inferring latent indel rate categories. We also show that our phylogenetic inference method may be more stable to taxa subsampling than methods that ignore either indels or indel rate variation.

Usage notes