Dryad

Coping with ineffective overlap in multilocus phylogenetics

Data files

Version: Jun 27, 2025. Files: 1.52 GB


Abstract

Missing data is a long-standing issue in phylogenetic inference and often results in high levels of taxonomic instability that obscure otherwise well-supported relationships. Multiple approaches have been developed to deal with the negative effects of ineffective overlap on tree resolution, often by identifying taxa for removal. Here we repurpose concatabominations, a heuristic method developed to identify unstable taxa in morphological data matrices, and combine it with a novel gene-tree jackknifing procedure on matrix representations of trees to identify candidates for targeted sequencing. Using a multilocus caecilian dataset, we illustrate the method's capacity to identify candidate taxa and loci for additional sequencing, compare the results to those of the mathematics-based gene sampling sufficiency approach, and explore the terrace space associated with the multilocus dataset. We show that our approach yields tractable numbers of loci and taxa for targeted sequencing, and that sequencing them successfully mitigates topological instability due to ineffective overlap, even when only modest amounts of data are added.
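
As a rough illustration of what poor overlap looks like in a taxon-by-locus matrix, the Python sketch below lists pairs of taxa that share no sequenced locus. It is not the concatabominations or gene-tree jackknifing procedure described in the abstract. The function name, taxon labels, and locus labels are hypothetical, and "no shared locus" is used only as a simple stand-in for ineffective overlap.

```python
from itertools import combinations


def non_overlapping_pairs(presence):
    """Return pairs of taxa that have no sequenced locus in common.

    `presence` maps each taxon name to the set of loci sequenced for it.
    This is a simplified proxy for ineffective overlap (an assumption),
    not the method used in the dataset's associated study.
    """
    pairs = []
    for a, b in combinations(sorted(presence), 2):
        # An empty intersection means the two taxa cannot be compared
        # directly on any locus.
        if not presence[a] & presence[b]:
            pairs.append((a, b))
    return pairs


# Hypothetical toy matrix: four taxa, three loci.
presence = {
    "taxon_A": {"locus1", "locus2"},
    "taxon_B": {"locus2", "locus3"},
    "taxon_C": {"locus3"},
    "taxon_D": {"locus1"},
}

for a, b in non_overlapping_pairs(presence):
    print(f"{a} and {b} share no locus")
```

In this toy matrix, sequencing one additional locus for a sparsely sampled taxon (for example, locus1 for taxon_C) would remove some of the listed pairs, which is the intuition behind selecting a small number of taxon and locus combinations for targeted sequencing.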