Data from: Whole-genome analyses resolve the phylogeny of flightless birds (Palaeognathae) in the presence of an empirical anomaly zone
Cloutier, Alison et al. (2019), Data from: Whole-genome analyses resolve the phylogeny of flightless birds (Palaeognathae) in the presence of an empirical anomaly zone, Dryad, Dataset, https://doi.org/10.5061/dryad.fj02s0j
Palaeognathae represent one of the two basal lineages in modern birds, and comprise the volant (flighted) tinamous and the flightless ratites. Resolving palaeognath phylogenetic relationships has historically proved difficult, and short internal branches separating major palaeognath lineages in previous molecular phylogenies suggest that extensive incomplete lineage sorting (ILS) might have accompanied a rapid ancient divergence. Here, we investigate palaeognath relationships using genome-wide data sets of three types of noncoding nuclear markers, together totalling 20,850 loci and over 41 million base pairs of aligned sequence data. We recover a fully resolved topology placing rheas as the sister to kiwi and emu + cassowary that is congruent across marker types for two species tree methods (MP-EST and ASTRAL-II). This topology is corroborated by patterns of insertions for 4,274 CR1 retroelements identified from multi-species whole genome screening, and is robustly supported by phylogenomic subsampling analyses, with MP-EST demonstrating particularly consistent performance across subsampling replicates as compared to ASTRAL. In contrast, analyses of concatenated data supermatrices recover rheas as the sister to all other non-ostrich palaeognaths, an alternative that lacks retroelement support and shows inconsistent behavior under subsampling approaches. While statistically supporting the species tree topology, conflicting patterns of retroelement insertions also occur and imply high amounts of ILS across short successive internal branches, consistent with observed patterns of gene tree heterogeneity. Coalescent simulations indicate that the majority of observed topological incongruence among gene trees is consistent with coalescent variation rather than arising from gene tree estimation error alone, and estimated branch lengths for short successive internodes in the inferred species tree fall within the theoretical range encompassing the anomaly zone. Distributions of empirical gene trees confirm that the most common gene tree topology for each marker type differs from the species tree, signifying the existence of an empirical anomaly zone in palaeognaths.