Data from: Sorting specimen-rich invertebrate samples with cost-effective NGS barcodes: validating a reverse workflow for specimen processing

Wang, Wendy Y.; Srivathsan, Amrita; Foo, Maosheng; Yamane, Seiki K.; Meier, Rudolf

Published Jan 11, 2018 on Dryad. https://doi.org/10.5061/dryad.8h950

Data files

Jan 11, 2018 version: files total 4.89 MB

Abstract

Biologists frequently sort specimen-rich samples to species. This process is daunting when based on morphology, and disadvantageous if performed using molecular methods that destroy vouchers (e.g., metabarcoding). An alternative is barcoding every specimen in a bulk sample and then presorting the specimens using DNA barcodes, thus mitigating downstream morphological work on presorted units. Such a “reverse workflow” is too expensive using Sanger sequencing, but we here demonstrate that it is feasible with an NGS barcoding pipeline that allows for cost-effective, high-throughput generation of short specimen-specific barcodes (313 bp of COI; lab cost <$0.50 per specimen) through Next Generation Sequencing of tagged amplicons. We applied our approach to a large sample of tropical ants, obtaining barcodes for 3290 of 4032 specimens (82%). NGS barcodes and their corresponding specimens were then sorted into molecular operational taxonomic units (mOTUs) based on objective clustering and Automated Barcode Gap Discovery (ABGD). A high diversity of 88-90 mOTUs (4% clustering) was found and morphologically validated based on preserved vouchers. The mOTUs were overwhelmingly in agreement with morphospecies (match ratio 0.95 at 4% clustering). Because of a lack of coverage in existing barcode databases, only 18 could be accurately identified to named species, but our study yielded new barcodes for 48 species, including 28 that are potentially new to science. With its low cost and technical simplicity, the NGS barcoding pipeline can be implemented by a wide range of laboratories. It accelerates invertebrate species discovery, facilitates downstream taxonomic work, helps with building comprehensive barcode databases, and yields precise abundance information.
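The clustering and validation steps named in the abstract can be illustrated with a small sketch. This is not the authors' pipeline: it assumes the barcodes are already aligned to equal length, treats objective clustering as single-linkage grouping on uncorrected pairwise distance (with a pair at or below the threshold counted as linked), and assumes the match ratio is computed as 2 × N_match / (N_1 + N_2), where N_match counts clusters with identical membership in both partitions. The function names `p_distance`, `objective_clusters`, and `match_ratio` are hypothetical.

```python
from itertools import combinations
from typing import Dict, Iterable, List, Set


def p_distance(a: str, b: str) -> float:
    """Uncorrected pairwise distance over positions where both bases are A/C/G/T."""
    pairs = [(x, y) for x, y in zip(a.upper(), b.upper())
             if x in "ACGT" and y in "ACGT"]
    if not pairs:
        return 1.0  # assumption: sequences with no comparable sites are never linked
    return sum(x != y for x, y in pairs) / len(pairs)


def objective_clusters(seqs: Dict[str, str], threshold: float = 0.04) -> List[Set[str]]:
    """Single-linkage clustering: two specimens share a cluster if a chain of
    pairwise distances <= threshold connects them (assumed reading of
    'objective clustering at 4%')."""
    parent = {name: name for name in seqs}

    def find(x: str) -> str:
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in combinations(seqs, 2):
        if p_distance(seqs[a], seqs[b]) <= threshold:
            parent[find(a)] = find(b)

    groups: Dict[str, Set[str]] = {}
    for name in seqs:
        groups.setdefault(find(name), set()).add(name)
    return list(groups.values())


def match_ratio(motus: Iterable[Set[str]], morphospecies: Iterable[Set[str]]) -> float:
    """Assumed definition: 2 * N_match / (N_1 + N_2), where N_match is the
    number of clusters with identical membership in both partitions."""
    a = {frozenset(c) for c in motus}
    b = {frozenset(c) for c in morphospecies}
    return 2 * len(a & b) / (len(a) + len(b))
```

Given a dictionary mapping specimen IDs to aligned 313 bp sequences, `objective_clusters(seqs, 0.04)` would return candidate mOTUs, and `match_ratio` would compare them with a morphology-based partition of the same specimen IDs. The all-pairs loop is quadratic in the number of specimens, which is acceptable for a sketch but slow for thousands of barcodes.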

Usage notes

NUSants_313bp_COI -- SUPERSEDED - TO BE DELETED AT CURATION

combined all_stats_NUSants

NUSants_313bp_COI_GenBank
