Dryad

Data from: TreeFix: statistically informed gene tree error correction using species trees

Cite this dataset

Wu, Yi-Chieh; Rasmussen, Matthew D.; Bansal, Mukul S.; Kellis, Manolis (2012). Data from: TreeFix: statistically informed gene tree error correction using species trees [Dataset]. Dryad. https://doi.org/10.5061/dryad.44cb5

Abstract

Accurate gene tree reconstruction is a fundamental problem in phylogenetics, with many important applications. However, sequence data alone often lack enough information to confidently support one gene tree topology over many competing alternatives. Here, we present a novel framework for combining sequence data and species tree information, and we describe an implementation of this framework in TreeFix, a new phylogenetic program for improving gene tree reconstructions. Given a gene tree (preferably computed using a maximum likelihood phylogenetic program), TreeFix finds a "statistically equivalent" gene tree that minimizes a species-tree-based cost function. We have applied TreeFix to two clades of 12 Drosophila and 16 fungal genomes, as well as to simulated phylogenies, and show that it dramatically improves reconstructions compared to current state-of-the-art programs. Given its accuracy, speed, and simplicity, TreeFix should be applicable to a wide range of analyses and have many important implications for future investigations of gene evolution. The source code and a sample dataset are available at http://compbio.mit.edu/treefix.

Usage notes