Data from: A method to generate multi-locus barcodes of pinned insect specimens using MiSeq
Data files
Jan 15, 2020 version files 2.98 MB
Abstract
For molecular insect identification, amplicon sequencing methods are recommended because they offer a cost-effective approach for targeting small sets of informative genes from multiple samples. In this context, high-throughput multilocus amplicon sequencing has been achieved using the Illumina MiSeq sequencing platform. However, this approach generates short gene fragments of less than 500 bp, which then have to be overlapped bioinformatically to achieve longer sequence lengths. This increases the risk of generating chimeric sequences or can result in incomplete loci. Here, we propose a modified nested amplicon sequencing method for targeting multiple loci from pinned insect specimens using the Illumina MiSeq platform. The modification consists of a three-step nested PCR approach that targets near-full-length loci in the initial PCR and subsequently amplifies short fragments of between 300 and 350 bp for high-throughput sequencing using Illumina chemistry. Using this method, we generated 407 barcode-compliant sequences for three loci from 98 of 138 pinned specimens. The method worked best for pinned specimens aged between 0 and 5 years, with a limit of 10 years for pinned and 14 years for ethanol-preserved specimens. Hence, our method overcomes some of the challenges of amplicon sequencing using short-read Next Generation Sequencing and improves the possibilities for creating high-quality multilocus barcodes from insect collections.
DNA was extracted from the legs of pinned and ethanol-preserved specimens and amplified using a modified nested PCR approach to allow sequencing on the Illumina MiSeq platform. The raw sequence data were processed using custom scripts, and the consensus sequences were analysed as described in the manuscript, including BLAST searches against the GenBank database accessed through the Geneious software program (v 8.1.3).
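The custom scripts used for this dataset are not reproduced here. Purely as an illustration of what building a consensus sequence can involve, the following is a minimal sketch of a majority-rule consensus over reads that are assumed to be already aligned and of equal length. The function name `majority_consensus`, the `min_fraction` threshold, and the use of `N` for unresolved positions are assumptions made for this example, not details taken from the manuscript.

```python
from collections import Counter


def majority_consensus(aligned_reads, min_fraction=0.5, ambiguity="N"):
    """Return a majority-rule consensus for equal-length, pre-aligned reads.

    Hypothetical helper for illustration only; it is not the authors' script.
    A base is called at a position only if its share of the column is
    strictly greater than min_fraction; otherwise `ambiguity` is emitted.
    """
    if not aligned_reads:
        raise ValueError("at least one read is required")
    length = len(aligned_reads[0])
    if any(len(read) != length for read in aligned_reads):
        raise ValueError("all reads must have the same aligned length")

    consensus = []
    # zip(*reads) walks the alignment column by column.
    for column in zip(*aligned_reads):
        counts = Counter(base.upper() for base in column)
        base, count = counts.most_common(1)[0]
        if count / len(column) > min_fraction:
            consensus.append(base)
        else:
            consensus.append(ambiguity)
    return "".join(consensus)


if __name__ == "__main__":
    reads = ["ACGT", "ACGA", "ACGT"]
    print(majority_consensus(reads))
```

A strict majority threshold is used so that evenly split columns are reported as ambiguous rather than resolved arbitrarily; a real pipeline would also need to handle read quality scores, gaps, and primer trimming.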