Dryad

ASTRAL-II: coalescent-based species tree estimation with many hundreds of taxa and thousands of genes

Data files

Jun 08, 2023 version files 16.83 GB

Abstract

Motivation: Estimating species phylogenies requires multiple loci, because different loci can have different trees due to incomplete lineage sorting, a process modeled by the multi-species coalescent model. We recently developed a coalescent-based method, ASTRAL, which is statistically consistent under the multi-species coalescent model and which is more accurate than other coalescent-based methods on the datasets we examined. ASTRAL runs in polynomial time by constraining the search space to a set of allowed ‘bipartitions’. ASTRAL remains statistically consistent despite this constraint.
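To make the term "bipartition" concrete: every internal edge of an unrooted tree splits the leaf set into two groups. The sketch below is only an illustration, not ASTRAL's implementation. It assumes trees are written as nested Python tuples of taxon names, and it assumes, purely for the example, that the allowed set is the union of the bipartitions found in the input gene trees. The helper names `leaf_set` and `bipartitions` are hypothetical.

```python
def leaf_set(tree):
    """Return the set of leaf labels under a nested-tuple tree."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset().union(*(leaf_set(child) for child in tree))


def bipartitions(tree):
    """Collect the nontrivial bipartitions induced by the edges of `tree`."""
    all_leaves = leaf_set(tree)
    found = set()

    def visit(node):
        if isinstance(node, str):
            return
        side = leaf_set(node)
        other = all_leaves - side
        # Skip trivial splits (a single leaf, or nothing, on one side).
        if len(side) >= 2 and len(other) >= 2:
            # Store the split as an unordered pair so A|B equals B|A.
            found.add(frozenset([side, other]))
        for child in node:
            visit(child)

    visit(tree)
    return found


# Two toy gene trees on five taxa (illustrative data, not from the dataset).
gene_trees = [
    ((("A", "B"), "C"), ("D", "E")),
    ((("A", "C"), "B"), ("D", "E")),
]

# Assumed construction: the allowed set is the union over all gene trees.
allowed = set()
for gene_tree in gene_trees:
    allowed |= bipartitions(gene_tree)
```

Here the two trees share the split ABC|DE and differ in one split each, so the allowed set holds three bipartitions. A search restricted to such a set can only propose species trees whose edges all appear in it.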

Results: We present a new version of ASTRAL, which we call ASTRAL-II. We show that ASTRAL-II has substantial advantages over ASTRAL: it is faster, can analyze much larger datasets (up to 1000 species and 1000 genes) and has substantially better accuracy under some conditions. ASTRAL’s running time is O(n^2 k |X|^2), and ASTRAL-II’s running time is O(n k |X|^2), where n is the number of species, k is the number of loci and X is the set of allowed bipartitions for the search space.
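The two bounds differ by exactly one factor of n, so the asymptotic gap grows linearly with the number of species. The short sketch below evaluates both expressions for a given n, k and |X|. It is a scaling illustration only: big-O notation hides constant factors, so these values are not runtime predictions, and the function names and the sample value of |X| are assumptions made for the example.

```python
def astral_cost(n, k, x):
    """Growth term of the ASTRAL bound: n^2 * k * |X|^2."""
    return n**2 * k * x**2


def astral2_cost(n, k, x):
    """Growth term of the ASTRAL-II bound: n * k * |X|^2."""
    return n * k * x**2


def bound_ratio(n, k, x):
    """Ratio of the two growth terms; k and |X| cancel, leaving n."""
    return astral_cost(n, k, x) // astral2_cost(n, k, x)


# 1000 species and 1000 genes, as in the abstract; |X| = 5000 is a made-up value.
ratio = bound_ratio(n=1000, k=1000, x=5000)
```

For 1000 species the ratio of the two growth terms is 1000, independent of k and |X|.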