Molecular sequencing and morphological identification reveal similar patterns in native bee communities across public and private grasslands of eastern North Dakota
Data files
Dec 19, 2019 version files 1.91 MB
- NDbeesequences.fasta
- ReadMe.txt
- sequencedatalong.csv
- specimendatalong.csv
Abstract
Bees play a key role in the functioning of human-modified and natural ecosystems by pollinating agricultural crops and wild plant communities. Global pollinator conservation efforts need large-scale and long-term monitoring to detect changes in species’ demographic patterns and shifts in bee community structure. The objective of this project was to test a molecular sequencing pipeline that would utilize a commonly used locus, produce accurate and precise identifications consistent with morphological identifications, and generate data that are both qualitative and quantitative. We applied this amplicon sequencing pipeline to native bee communities sampled across Conservation Reserve Program (CRP) lands and native grasslands in eastern North Dakota. We found the 28S LSU locus to be more capable of discriminating between species than the 18S SSU rRNA locus, and in some cases even resolved instances of cryptic species or morphologically ambiguous species complexes. Overall, we found the amplicon sequencing method to be a qualitatively accurate representation of the sampled bee community richness and species identity, especially when a well-curated database of known 28S LSU sequences is available. Both morphological identification and molecular sequencing revealed similar patterns in native bee community structure across CRP lands and native prairie. Additionally, a genetic algorithm approach to compute taxon-specific correction factors using a small subset of the most concordant samples demonstrated that a high level of quantitative accuracy could be possible if the specimens are fresh and processed soon after collection. Here we provide a first step to a molecular pipeline for identifying insect pollinator communities. This tool should prove useful for future national monitoring efforts as use of molecular tools becomes more affordable and as numbers of 28S LSU sequences for pollinator species increase in publicly-available databases.
Methods
Pollinator communities were sampled from four paired grassland “locations” within the Prairie Pothole Region of eastern North Dakota in 2012 and 2013. At each Northern Prairie Adaptive Management (NPAM) and Conservation Reserve Program (CRP) location, 20 transects were established at least 65 meters from each other, and 5 meters away from fences, wetlands, and shelterbelts. Transects were sampled approximately once every other week from May to September in both years, with each sample consisting of a single unscented Springstar blue vane trap placed at one fixed end of the transect for 24 hours. Pollinator specimens were identified morphologically to the lowest possible taxonomic level at the NPWRC invertebrate lab in Jamestown, North Dakota. For molecular identification, one mesothoracic leg from each specimen was placed into a 2.0 ml microcentrifuge tube (one tube per sample), and genomic DNA was extracted and purified with a bead bashing protocol. Amplicons for the 18S and 28S loci were generated by PCR and sequenced on the Illumina MiSeq using 2x300 bp paired-end reads. The paired reads were merged into single reads, filtered to remove any read with at least one predicted error, and dereplicated using the USEARCH commands -fastq_mergepairs, -fastq_filter, and -fastx_uniques, respectively.
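The filtering and dereplication steps described above can be illustrated in a few lines of Python. This is a minimal sketch of the underlying ideas, not the USEARCH implementation: it assumes Phred+33 quality strings, and the function names, the 1.0 expected-error threshold, and the toy reads are illustrative assumptions rather than settings or data reported by the authors.

```python
from collections import Counter


def expected_errors(quality: str, offset: int = 33) -> float:
    """Sum the per-base error probabilities implied by a Phred quality string."""
    return sum(10 ** (-(ord(ch) - offset) / 10) for ch in quality)


def filter_reads(reads, max_ee: float = 1.0):
    """Keep (sequence, quality) pairs with fewer than max_ee expected errors."""
    return [(seq, qual) for seq, qual in reads if expected_errors(qual) < max_ee]


def dereplicate(sequences):
    """Collapse identical sequences into (sequence, count), most abundant first."""
    return Counter(sequences).most_common()


# Made-up merged reads: 'I' encodes Q40 and '#' encodes Q2 under Phred+33.
reads = [
    ("ACGTACGT", "IIIIIIII"),
    ("ACGTACGT", "IIIIIIII"),
    ("ACGTTCGT", "IIII##II"),  # two low-quality bases push expected errors above 1
    ("GGGTACGT", "IIIIIIII"),
]

kept = filter_reads(reads)
uniques = dereplicate(seq for seq, _ in kept)
```

Here the third read is discarded by the quality filter, and the two identical high-quality reads collapse into a single unique sequence with a count of 2.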
Usage notes
Files associated with Darby et al. "Molecular sequencing and morphological identification reveal similar patterns in native bee communities across public and private grasslands of eastern North Dakota":
~~~~~~~~~~~~~~~~~
NDbeesequences.fasta
-FASTA-formatted file containing all new and unique bee sequences obtained from 18S and 28S amplicon sequencing
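A FASTA file such as this one can be read without third-party libraries. The sketch below is a generic parser, offered as one possible approach; the function name `read_fasta` and the placeholder records are assumptions made for illustration, and the actual header layout of the file is not shown here.

```python
import io


def read_fasta(handle):
    """Yield (header, sequence) pairs from a FASTA-formatted text stream."""
    header, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


# Placeholder records only; these are not sequences from the data set.
example = io.StringIO(
    ">placeholder_species_A_28S\nACGTAC\nGTAC\n"
    ">placeholder_species_B_18S\nTTGCA\n"
)
records = dict(read_fasta(example))
```

With the real file, the same function could be called as `with open("NDbeesequences.fasta") as fh: records = dict(read_fasta(fh))`.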
~~~~~~~~~~~~~~~~~~~
sequencedatalong.csv
-comma-delimited file containing the sequencing read counts for all community samples:
--Sample: unique internal identifier for each community sample (1-224)
--Location: one of four sampling locations (Arrowwood, Kulm, SullysHill, Tewaukon)
--Management: one of two grassland management regimes (CRP - Conservation Reserve Program, or NPAM - Northern Prairie Adaptive Management)
--Date: Sampling Date (MM/DD/YYYY)
--Year: Sampling Year (2012 or 2013)
--Month: Sampling Month (integer 5 through 9)
--Species: Species name, code, and locus (corresponding to fasta headers in "NDbeesequences.fasta" file)
--Locus: one of two genetic loci (18S [small subunit rRNA] or 28S [large subunit rRNA])
--Readcount: number of high-throughput sequencing reads matching the species.
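Because the table is in long format (one row per sample, species, and locus), per-sample summaries require grouping. The sketch below shows one way to turn read counts into within-sample proportions for a single locus using the column names listed above. The function name `relative_abundance`, the choice to skip samples with zero reads, and the example rows are assumptions; the rows are made up to mimic the documented layout and are not taken from the data set.

```python
import csv
import io
from collections import defaultdict

# Made-up rows that only mimic the documented column layout.
example_csv = """Sample,Location,Management,Date,Year,Month,Species,Locus,Readcount
1,Kulm,CRP,06/15/2012,2012,6,placeholder_sp_A_28S,28S,300
1,Kulm,CRP,06/15/2012,2012,6,placeholder_sp_B_28S,28S,100
1,Kulm,CRP,06/15/2012,2012,6,placeholder_sp_A_18S,18S,50
2,Arrowwood,NPAM,06/16/2012,2012,6,placeholder_sp_A_28S,28S,0
"""


def relative_abundance(handle, locus="28S"):
    """Return {sample: {species: share of that sample's reads}} for one locus."""
    counts = defaultdict(lambda: defaultdict(int))
    for row in csv.DictReader(handle):
        if row["Locus"] == locus:
            counts[row["Sample"]][row["Species"]] += int(row["Readcount"])
    result = {}
    for sample, by_species in counts.items():
        total = sum(by_species.values())
        if total > 0:  # samples without reads at this locus are skipped
            result[sample] = {sp: n / total for sp, n in by_species.items()}
    return result


proportions = relative_abundance(io.StringIO(example_csv))
```

To run this on the real table, pass an open file handle for "sequencedatalong.csv" instead of the in-memory example.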
~~~~~~~~~~~~~~~~~~~~
specimendatalong.csv
-comma-delimited file containing the specimen counts for all community samples:
--Location: one of four sampling locations (Arrowwood, Kulm, SullysHill, Tewaukon)
--Management: one of two grassland management regimes (CRP - Conservation Reserve Program, or NPAM - Northern Prairie Adaptive Management)
--Date: Sampling Date (MM/DD/YYYY)
--Species: Species name, code, and locus (corresponding to fasta headers in "NDbeesequences.fasta" file)
--Specimencount: number of specimens identified morphologically to the species.
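The specimen table has no Year column, so the year has to be derived from the Date field. The sketch below counts the species observed per management regime and year, as one hedged example of working with the columns listed above. The function name `richness_by_management_year`, the treatment of zero counts as absences, and the example rows are assumptions; the rows are made up and are not taken from the data set.

```python
import csv
import io
from collections import defaultdict
from datetime import datetime

# Made-up rows that only mimic the documented column layout.
example_csv = """Location,Management,Date,Species,Specimencount
Kulm,CRP,06/15/2012,placeholder_sp_A,3
Kulm,CRP,06/15/2012,placeholder_sp_B,0
Tewaukon,NPAM,07/01/2013,placeholder_sp_A,1
Tewaukon,NPAM,07/01/2013,placeholder_sp_B,2
"""


def richness_by_management_year(handle):
    """Count distinct species with at least one specimen per (Management, year)."""
    species = defaultdict(set)
    for row in csv.DictReader(handle):
        if int(row["Specimencount"]) > 0:
            year = datetime.strptime(row["Date"], "%m/%d/%Y").year
            species[(row["Management"], year)].add(row["Species"])
    return {key: len(names) for key, names in species.items()}


richness = richness_by_management_year(io.StringIO(example_csv))
```

With the real table, pass an open file handle for "specimendatalong.csv" in place of the in-memory example.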