Dataset for: Smith et al., Phylogenomic analysis of the parrots of the world distinguishes artifactual from biological sources of gene tree discordance

Cite this dataset

Smith, Brian Tilston (2022). Dataset for: Smith et al., Phylogenomic analysis of the parrots of the world distinguishes artifactual from biological sources of gene tree discordance [Dataset]. Dryad. https://doi.org/10.5061/dryad.b5mkkwhfm

Abstract

Gene tree discordance is expected in phylogenomic trees, and biological processes are often invoked to explain it. However, heterogeneous levels of phylogenetic signal among individuals within datasets may cause artifactual sources of topological discordance. We examined how the information content in tips and subclades impacts topological discordance in the parrots (Order: Psittaciformes), a diverse and highly threatened clade of nearly 400 species. Using ultraconserved elements from 96% of the clade's species-level diversity, we estimated concatenated and species trees for 382 ingroup taxa. We found that discordance among tree topologies was most common at nodes dating between the late Miocene and Pliocene, and often at the taxonomic level of genus. Accordingly, we used two metrics to characterize information content in tips and assess the degree to which conflict between trees was being driven by lower-quality samples. Most instances of topological conflict and non-monophyletic genera in the species tree could be objectively identified using these metrics. For subclades still discordant after tip-based filtering, we used a machine learning approach to determine whether phylogenetic signal or noise was the more important predictor of metrics supporting the alternative topologies. We found that when signal favored one of the topologies, noise was the most important variable in poorly performing models that favored the alternative topology. In sum, we show that artifactual sources of gene tree discordance, which are likely a common phenomenon in many datasets, can be distinguished from biological sources by quantifying the information content in each tip and modeling which factors support each topology.

Methods

Please see the Materials and Methods section of the main text and Parrots_UCE_variant_calling_and_other_methods.txt.

Usage notes

See README.txt

Funding

National Science Foundation, Award: DEB-1655736

National Science Foundation, Award: DBI-2029955

National Science Foundation, Award: DEB-1557053