Data from: Upstream analyses create problems with DNA-based species delimitation
Data files
Dec 12, 2013 version files 18.74 MB
- datafiles.zip
- SupplementaryFigs&Table.pdf
Abstract
Genetic-based delimitation of species typically involves a multistep process in which DNA data are analyzed with a series of different programs. Although the performance of the programs associated with each step has been evaluated separately, no analysis has considered how errors in the upstream assignment of individuals to putative species affect the accuracy of species delimited in downstream analyses, such as those associated with the coalescent-based Bayesian program bpp. Here we show that because the minimal data requirements for accurate performance in each of the separate steps involved in the delimitation process differ, the reliability of inferences about species delimited from DNA sequences can be compromised. Our results provide important insights into the practice of species delimitation. Specifically, even if users exercise the practices advocated for DNA-based delimitation, there may very well be errors in individual-species association, and consequently uncertainty in the guide tree (both derived from upstream analyses that are prerequisites for analyses with bpp), which can lead to under- or overestimation of biodiversity, even though the Bayesian program bpp itself may perform very well. These results highlight the usefulness of complementary data (i.e., data in addition to genetic data), especially for the assignment of individuals to putative species, to improve the accuracy of species delimitation.