Dryad

Data from: A fuzzy-set-theory-based approach to analyze species membership in DNA barcoding

Data files

Apr 04, 2011 version; files: 142.40 KB

Abstract

Reliably assigning an unknown query sequence to its correct species remains a methodological problem for the growing field of DNA barcoding. Despite great recent advances, species identification from barcodes can still be unreliable if the relevant biodiversity has been insufficiently sampled. Here we propose a new notion of species membership for DNA barcoding, fuzzy membership, which is based on fuzzy set theory, and illustrate its successful application to four real datasets (bats, fishes, butterflies and flies) with more than 5000 random simulations. Two of the datasets comprise especially dense species- and population-level samples. Compared with current DNA barcoding methods, the newly proposed minimum distance (MD) plus fuzzy set approach and another computationally simple method, “best close match”, outperform two computationally sophisticated methods, Bayesian and BootstrapNJ. When conspecifics of the query are absent from the reference database, the new method is especially effective at reducing false-positive species identifications relative to the other methods.
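
The abstract does not spell out the membership function, so the following is only a minimal Python sketch of the general idea: find the reference barcode at minimum distance from the query, then turn that distance into a graded membership value rather than a yes/no assignment. Everything specific here is an assumption made for illustration and is not taken from the paper or its data files: the uncorrected p-distance, the linear membership function, the thresholds `full` and `none`, and the toy sequences.

```python
from typing import List, Tuple


def p_distance(a: str, b: str) -> float:
    """Proportion of differing sites between two aligned, equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def nearest_species(query: str, reference: List[Tuple[str, str]]) -> Tuple[str, float]:
    """Return (species, distance) for the reference barcode closest to the query."""
    best = min(reference, key=lambda rec: p_distance(query, rec[1]))
    return best[0], p_distance(query, best[1])


def fuzzy_membership(distance: float, full: float = 0.01, none: float = 0.03) -> float:
    """Hypothetical linear membership: 1 at or below `full`, 0 at or above `none`."""
    if distance <= full:
        return 1.0
    if distance >= none:
        return 0.0
    return (none - distance) / (none - full)


# Toy reference library of (species label, aligned barcode); purely illustrative.
reference = [
    ("species_A", "ACGTACGTACGTACGTACGT"),
    ("species_A", "ACGTACGTACGTACGTACGA"),
    ("species_B", "ACGTTCGTACCTACGTAGGT"),
]
query = "ACGTACGTACGTACGTACGT"

species, dist = nearest_species(query, reference)
membership = fuzzy_membership(dist)
print(species, dist, membership)
```

A graded value lets a query whose nearest neighbour is distant receive low or zero membership instead of being forced into the closest species. That is the situation the abstract highlights, where conspecifics of the query are missing from the reference database.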