Dryad

Data from: How many characters are needed to reconstruct a phylogeny?

Data files

Sep 26, 2025 version files 257.88 MB

Abstract

Despite increased recent attention to Bayesian phylogenetics and its applications in understanding macroevolutionary processes, it remains unclear how many discrete characters are needed to accurately estimate tree topologies in a Bayesian framework. This question could be particularly relevant for morphological datasets used in phylogenetics, which usually consist of a few dozen to a few hundred characters, orders of magnitude fewer than most molecular datasets. I designed a simulation study in the software RevBayes to explore how the number of sampled discrete characters affects the accuracy and precision of Bayesian phylogenetic estimates under various setups differing in number of taxa, average number of state changes per character (i.e., tree length), and number of states per character. Results indicate that between 100 and 500 variable characters are necessary to reach sufficient accuracy and precision of phylogenetic estimates for trees with as few as 20 tips. All other parameters being equal, multistate characters produce slightly more accurate estimates than binary characters, and more labile characters produce more accurate estimates for trees with more than 50 tips. The results of this study highlight the continuing need for global research efforts geared towards the characterization and digitization of interspecific morphological diversity in both extant and extinct taxa.