Utilizing field collected insects for next generation sequencing: effects of sampling, storage, and DNA extraction methods
Cite this dataset
Ballare, Kimberly et al. (2020). Utilizing field collected insects for next generation sequencing: effects of sampling, storage, and DNA extraction methods [Dataset]. Dryad. https://doi.org/10.7291/D1CD4P
DNA sequencing technologies continue to advance the biological sciences, expanding opportunities for genomic studies of non-model organisms for basic and applied questions. Despite these opportunities, many next-generation sequencing protocols have been developed assuming a substantial quantity of high molecular weight DNA (>100 ng), which can be difficult to obtain for many study systems. In particular, the ability to sequence field-collected specimens that exhibit varying levels of DNA degradation remains largely unexplored. In this study, we investigate the influence of five traditional insect capture and curation methods on Double-Digest Restriction Enzyme Associated DNA (ddRAD) sequencing success for three wild bee species. We sequenced a total of 105 specimens (7-13 specimens per species and treatment). We additionally investigated how different DNA quality metrics (including pre-sequence concentration and contamination) predicted downstream sequencing success, and compared two DNA extraction methods. We report successful library preparation for all specimens, with all treatments and extraction methods producing enough highly reliable loci for population genetic analyses. Although results varied between species, we found that specimens collected by net sampling directly into 100% EtOH, or by passive trapping followed by 100% EtOH storage before pinning, tended to produce higher-quality ddRAD assemblies, likely as a result of rapid specimen desiccation. Surprisingly, we found that specimens preserved in propylene glycol during field sampling exhibited lower-quality assemblies. We provide recommendations for each treatment, extraction method, and DNA quality assessment, and further encourage researchers to consider utilizing a wider variety of specimens for genomic analyses.
Specimens were captured using three standardized methods, hand netting, blue vane trapping (Stephen & Rao 2007), and pan trapping (Roulston et al. 2007), at 39 sites across Texas during the summers of 2012-2014. All specimens were extracted in Spring 2016 with the Qiagen® DNeasy Blood and Tissue Kit, following the standard protocol with a few minor modifications to maximize DNA yield. We extracted approximately 1 cm³ of tissue from each specimen, using thoracic tissue from B. pensylvanicus and M. tepaneca, and the entire specimen for the smaller species L. bardum. We additionally investigated the DNAzol® extraction technique for B. pensylvanicus specimens only, given their larger size and ample tissue availability. B. pensylvanicus thoraces were divided in half and extracted using a customized DNAzol® protocol (Chomczynski et al. 1997). After normalization based on PicoGreen® measurements, 100 ng of DNA per sample was digested in 1X NEB CutSmart Buffer with 100 U each of EcoRI-HF and MspI (NEB), in a final volume of 25 µL, at 37°C for 4 hours. All 136 samples were sequenced across two lanes of an Illumina HiSeq 2500 operated by the Genomics and Bioinformatics service at Texas A&M University (TAMU), with 125 million reads per lane. De-multiplexing of sequencing reads, trimming of adapters, and removal of barcodes were performed using bcl2fastq2 v2.19 (Illumina).
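As a rough guide to the sequencing depth implied by these figures, the short Python sketch below divides the stated lane output by the number of samples. It assumes perfectly even pooling and treats 125 million reads per lane as a nominal value; the function name is illustrative and not part of the dataset. Actual per-sample read counts will differ.

```python
# Values taken from the methods text; even pooling is an assumption.
LANES = 2
READS_PER_LANE = 125_000_000
SAMPLES = 136


def mean_reads_per_sample(lanes, reads_per_lane, samples):
    """Average raw reads per sample if reads were split evenly."""
    return lanes * reads_per_lane / samples


expected = mean_reads_per_sample(LANES, READS_PER_LANE, SAMPLES)
print(f"{expected:,.0f} reads per sample")  # prints "1,838,235 reads per sample"
```

This is an upper bound on usable data per sample, since it is calculated before any quality filtering or loss during de-multiplexing.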
Data include: 1) a CSV file with individual sample names, sequence read names, treatments, and DNA quality metrics (NanoDrop indices and DNA concentration); 2) a tar archive of R data analysis scripts; and 3) tar archives of assembly files and STACKS outputs for the Bombus pensylvanicus, Melissodes tepaneca, and Lasioglossum bardum ddRAD assemblies. Each tar archive contains ReadMe files that describe the individual files and their usage. See the Methods section in Ballare et al. (2019) Ecology and Evolution for further details on analysis methods.
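One possible way to take a first look at the files is sketched below in Python, using only the standard library. It locates ReadMe files inside a tar archive and tallies samples by the values in one CSV column. The helper names and the column name `treatment` used in the usage note are hypothetical; check the actual file names and CSV header in the download before use.

```python
import csv
import io
import tarfile
from collections import Counter


def list_readmes(archive):
    """Return names of archive members whose file name contains 'readme'.

    `archive` is a binary file object, e.g. open(path, "rb").
    """
    with tarfile.open(fileobj=archive, mode="r:*") as tar:
        return [
            member.name
            for member in tar.getmembers()
            if member.isfile()
            and "readme" in member.name.rsplit("/", 1)[-1].lower()
        ]


def count_by_column(csv_text, column):
    """Count rows per distinct value of `column` in CSV text with a header row."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return Counter(row[column] for row in reader)
```

For example, after opening one of the archives with `open(path, "rb")`, `list_readmes` returns the ReadMe paths to read first. Passing the text of the sample CSV and a column such as `treatment` (a hypothetical header name) to `count_by_column` gives the number of samples per treatment.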