Dryad

Data from: Coalescent-based species tree inference from gene tree topologies under incomplete lineage sorting by maximum likelihood

Cite this dataset

Wu, Yufeng (2011). Data from: Coalescent-based species tree inference from gene tree topologies under incomplete lineage sorting by maximum likelihood [Dataset]. Dryad. https://doi.org/10.5061/dryad.240h7g8r

Abstract

Incomplete lineage sorting can cause incongruence between the phylogenetic history of genes (the gene tree) and that of the species (the species tree), which can complicate the inference of phylogenies. In this paper, I present a new coalescent-based algorithm for species tree inference with maximum likelihood. I first describe an improved method for computing the probability of a gene tree topology given a species tree, which is much faster than an existing algorithm by Degnan and Salter (2005). Based on this method, I develop a practical algorithm that takes a set of gene tree topologies and infers species trees with maximum likelihood. The algorithm starts from a set of initial species trees and performs a heuristic search for trees with higher likelihood. This algorithm, called STELLS, has been implemented in a program that is downloadable from the author’s web page. The simulation results show that the STELLS algorithm is more accurate than an existing maximum likelihood method for many datasets, especially when there is noise in gene trees. I also show that the STELLS algorithm is efficient and can be applied to real biological datasets.
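
To make the idea of maximum likelihood inference from gene tree topologies concrete, the sketch below works through the smallest case, three taxa. It is not the STELLS algorithm and does not use the STELLS program. It relies only on the standard coalescent result that, for a species tree ((A,B),C) with internal branch length t in coalescent units, the matching gene tree topology has probability 1 - (2/3)e^(-t) and each of the two discordant topologies has probability (1/3)e^(-t). The function names, the Newick-style topology strings, and the cap t_max are assumptions made for illustration.

```python
import math

# The three possible rooted topologies on taxa A, B, C.
TOPOLOGIES = ("((A,B),C)", "((A,C),B)", "((B,C),A)")


def topology_log_prob(gene, species, t):
    """Log-probability of a gene tree topology given a three-taxon species
    tree whose internal branch has length t (coalescent units)."""
    discordant = math.exp(-t) / 3.0
    if gene == species:
        return math.log(1.0 - 2.0 * discordant)
    return math.log(discordant)


def log_likelihood(species, t, counts):
    """Log-likelihood of observed topology counts, treating genes as independent."""
    return sum(k * topology_log_prob(g, species, t) for g, k in counts.items())


def best_branch_length(species, counts, t_max=10.0):
    """Closed-form ML estimate of t for one candidate species topology.
    t_max is an illustrative cap used when every gene tree is concordant."""
    n = sum(counts.values())
    p = counts.get(species, 0) / n  # observed fraction of concordant gene trees
    if p <= 1.0 / 3.0:
        return 0.0  # boundary solution: no support for this internal branch
    if p >= 1.0:
        return t_max
    return min(t_max, -math.log(1.5 * (1.0 - p)))


def infer_species_tree(counts):
    """Try each candidate topology and keep the one with the highest likelihood."""
    best = None
    for species in TOPOLOGIES:
        t = best_branch_length(species, counts)
        ll = log_likelihood(species, t, counts)
        if best is None or ll > best[2]:
            best = (species, t, ll)
    return best


# Hypothetical counts of gene tree topologies across 100 loci.
counts = {"((A,B),C)": 70, "((A,C),B)": 15, "((B,C),A)": 15}
species, t, ll = infer_species_tree(counts)
print(species, round(t, 4), round(ll, 2))
```

With three taxa the candidate trees can be enumerated exhaustively and the branch length has a closed form. With more taxa neither holds, which is why the abstract describes a faster probability computation combined with a heuristic search over species trees.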

Usage notes