Dryad

StarBeast3: Adaptive parallelised Bayesian inference under the multispecies coalescent

Cite this dataset

Douglas, Jordan (2022). StarBeast3: Adaptive parallelised Bayesian inference under the multispecies coalescent [Dataset]. Dryad. https://doi.org/10.5061/dryad.f1vhhmgzk

Abstract

As genomic sequence data becomes increasingly available, inferring the phylogeny of the species as that of concatenated genomic data can be enticing. However, this approach makes for a biased estimator of branch lengths and substitution rates and an inconsistent estimator of tree topology. Bayesian multispecies coalescent methods address these issues by constraining a set of gene trees within a species tree and jointly inferring both under a Bayesian framework. However, this approach comes at the cost of increased computational demand. Here, we introduce StarBeast3, a software package for efficient Bayesian inference under the multispecies coalescent model via Markov chain Monte Carlo. We gain efficiency by introducing cutting-edge proposal kernels and adaptive operators, and StarBeast3 is particularly efficient when a relaxed clock model is applied. Furthermore, gene tree inference is parallelised, allowing the software to scale with the size of the problem. We validated our software and benchmarked its performance using three real and two synthetic datasets. Our results indicate that StarBeast3 is up to one-and-a-half orders of magnitude faster than StarBeast2, and therefore more than two orders of magnitude faster than *BEAST, depending on the dataset and on the parameter, and can achieve convergence on large datasets with hundreds of genes. StarBeast3 is open-source and is easy to set up with a friendly graphical user interface.

Methods

This appendix summarises the outputs of Bayesian MCMC on biological and simulated datasets, as well as supplementary methodological details.

Funding