Shokralla, Shadi1; Porter, Teresita M.2; Gibson, Joel F.1; Dobosz, Rafal1; Janzen, Daniel H.3; Hallwachs, Winnie3; Golding, G. Brian2; Hajibabaei, Mehrdad1

Published Feb 18, 2016 on Dryad. https://doi.org/10.5061/dryad.j897m

Abstract

Genetic information is a valuable component of biosystematics, especially specimen identification through the use of species-specific DNA barcodes. Although many genomics applications have shifted to High-Throughput Sequencing (HTS) or Next-Generation Sequencing (NGS) technologies, sample identification (e.g., via DNA barcoding) is still most often done with Sanger sequencing. Here, we present a scalable double dual-indexing approach using an Illumina MiSeq platform to sequence DNA barcode markers. We achieved 97.3% success by using half of an Illumina MiSeq flowcell to obtain 658 base pairs of the cytochrome c oxidase I DNA barcode in 1,010 specimens from eleven orders of arthropods. Our approach recovers a greater proportion of DNA barcode sequences from individuals than does conventional Sanger sequencing, while at the same time reducing both per-specimen costs and labor time by nearly 80%. In addition, the use of HTS allows the recovery of multiple sequences per specimen, for deeper analysis of genetic variation in target gene regions.

Plate01-malaise-BR_S13_L001_R1_001.fastq

Plate01-malaise-BR_S13_L001_R2_001.fastq

Plate01-malaise-FC_S1_L001_R1_001.fastq

Plate01-malaise-FC_S1_L001_R2_001.fastq

Plate02-malaise-BR_S14_L001_R1_001.fastq

Plate02-malaise-BR_S14_L001_R2_001.fastq

Plate02-malaise-FC_S2_L001_R1_001.fastq

Plate02-malaise-FC_S2_L001_R2_001.fastq

Plate03-malaise-BR_S15_L001_R1_001.fastq

Plate03-malaise-BR_S15_L001_R2_001.fastq

Plate03-malaise-FC_S3_L001_R1_001.fastq

Plate03-malaise-FC_S3_L001_R2_001.fastq

Plate04-malaise-BR_S16_L001_R1_001.fastq

Plate04-malaise-BR_S16_L001_R2_001.fastq

Plate04-malaise-FC_S4_L001_R1_001.fastq

Plate04-malaise-FC_S4_L001_R2_001.fastq

Plate05-malaise-BR_S17_L001_R1_001.fastq

Plate05-malaise-BR_S17_L001_R2_001.fastq

Plate05-malaise-FC_S5_L001_R1_001.fastq

Plate05-malaise-FC_S5_L001_R2_001.fastq

Plate06-malaise-BR_S18_L001_R1_001.fastq

Plate06-malaise-BR_S18_L001_R2_001.fastq

Plate06-malaise-FC_S6_L001_R1_001.fastq

Plate06-malaise-FC_S6_L001_R2_001.fastq

Plate07-malaise-BR_S19_L001_R1_001.fastq

Plate07-malaise-BR_S19_L001_R2_001.fastq

Plate07-malaise-FC_S7_L001_R1_001.fastq

Plate07-malaise-FC_S7_L001_R2_001.fastq

Plate08-malaise-BR_S20_L001_R1_001.fastq

Plate08-malaise-BR_S20_L001_R2_001.fastq

Plate08-malaise-FC_S8_L001_R1_001.fastq

Plate08-malaise-FC_S8_L001_R2_001.fastq

Plate09-malaise-BR_S21_L001_R1_001.fastq

Plate09-malaise-BR_S21_L001_R2_001.fastq

Plate09-malaise-FC_S9_L001_R1_001.fastq

Plate09-malaise-FC_S9_L001_R2_001.fastq

Plate10-malaise-BR_S22_L001_R1_001.fastq

Plate10-malaise-BR_S22_L001_R2_001.fastq

Plate10-malaise-FC_S10_L001_R1_001.fastq

Plate10-malaise-FC_S10_L001_R2_001.fastq
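
The file names listed above share one pattern: a sample label, an `S` number, a lane (`L001`), a read direction (`R1` or `R2`), and a `_001.fastq` suffix. The following minimal Python sketch, which is not part of the deposited data, shows one way to match each `R1` file with its `R2` mate by parsing that pattern. The function names `pair_reads` and `unpaired` are hypothetical, and the `BR` and `FC` parts of the sample label are treated as opaque text because their meaning is not stated on this page.

```python
import re
from collections import defaultdict

# Pattern inferred from names such as Plate01-malaise-BR_S13_L001_R1_001.fastq
# (an assumption based on the listing above, not a documented specification).
FASTQ_NAME = re.compile(
    r"^(?P<sample>.+)_S(?P<index>\d+)_L(?P<lane>\d{3})_R(?P<read>[12])_001\.fastq$"
)


def pair_reads(filenames):
    """Group file names into {sample_label: {"R1": name, "R2": name}}."""
    pairs = defaultdict(dict)
    for name in filenames:
        match = FASTQ_NAME.match(name)
        if match is None:
            continue  # ignore names that do not follow the pattern
        pairs[match["sample"]]["R" + match["read"]] = name
    return dict(pairs)


def unpaired(pairs):
    """Return the sample labels that lack either the R1 or the R2 file."""
    return sorted(s for s, reads in pairs.items() if set(reads) != {"R1", "R2"})


example = [
    "Plate01-malaise-BR_S13_L001_R1_001.fastq",
    "Plate01-malaise-BR_S13_L001_R2_001.fastq",
    "Plate01-malaise-FC_S1_L001_R1_001.fastq",
    "Plate01-malaise-FC_S1_L001_R2_001.fastq",
]
pairs = pair_reads(example)
```

Passing the full listing to `pair_reads` and then calling `unpaired` on the result is a quick check that every file in a download has its mate before any read merging is attempted.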

Data from: Massively parallel multiplex DNA sequencing for specimen identification using an Illumina MiSeq platform

Data files