Data from: Assessing combinability of phylogenomic data using Bayes Factors

Cite this dataset

Neupane, Suman et al. (2019). Data from: Assessing combinability of phylogenomic data using Bayes Factors [Dataset]. Dryad.


With the rapid reduction in sequencing costs of high-throughput genomic data, it has become commonplace to use hundreds of genes to infer the phylogeny of any study system. While sampling a large number of genes has given us a tremendous opportunity to uncover previously unknown relationships and improve phylogenetic resolution, it also presents us with new challenges when the phylogenetic signal is confused by differences in the evolutionary histories of sampled genes. Given the incorporation of accurate marginal likelihood estimation methods into popular Bayesian software programs, it is natural to consider using the Bayes Factor (BF) to compare different partition models in which genes within any given partition subset share both tree topology and edge lengths. We explore using marginal likelihood to assess data subset combinability when data subsets have varying levels of phylogenetic discordance due to deep coalescence events among genes (simulated within a species tree), and compare the results with our recently described phylogenetic informational dissonance index (D) estimated for each data set. BF effectively detects phylogenetic incongruence and provides a way to assess the statistical significance of D values. We use BFs to assess data combinability using an empirical data set comprising 56 plastid genes from the green algal order Volvocales. We also discuss the potential need for calibrating BFs and demonstrate that BFs used in this study are correctly calibrated.
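
As a rough illustration of the model comparison described above, the sketch below computes a log Bayes factor from log marginal likelihoods. It assumes that, under the separate model, data subsets are independent, so that model's log marginal likelihood is the sum of the per-subset values. The function name, the sign convention (combined minus separate), and the numbers are hypothetical and are not taken from this dataset or from any particular software output.

```python
from typing import Sequence


def log_bayes_factor(log_ml_combined: float,
                     log_ml_separate: Sequence[float]) -> float:
    """Log Bayes factor of a combined model against a separate model.

    log_ml_combined: log marginal likelihood when all subsets share one
        tree topology and set of edge lengths.
    log_ml_separate: log marginal likelihoods of each subset analysed on
        its own; summed here under an assumed independence of subsets.

    Assumed convention: a positive value favours combining the subsets,
    a negative value favours keeping them separate.
    """
    return log_ml_combined - sum(log_ml_separate)


# Made-up log marginal likelihoods for two gene subsets.
log_bf = log_bayes_factor(-10250.0, [-5100.0, -5170.0])
print(f"log BF (combined vs. separate) = {log_bf:.1f}")
```

With these made-up inputs the log Bayes factor is positive, which under the assumed convention would favour combining the two subsets. Real analyses would take the log marginal likelihoods from the estimation method used in the study rather than from hand-entered values.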

National Science Foundation, Award: DEB-1354146