Dataset for: Molecular diversity of dragonflies in high altitude Andean lakes through DNA barcoding

Alfaro-Núñez, Alonzo 1; Chiriboga-Ortega, Rodrigo 2; Van der heyden, Christine 3; Sulen Burgos, Maria Eugenia 4; Ortega, Sania 2; Oña, Tania 2; Velarde, Elizabeth 2; Goethals, Peter 5

Published Feb 02, 2021 on Dryad. https://doi.org/10.5061/dryad.rjdfn2z9b

Data files

Feb 02, 2021 version files 36.41 KB

Abstract

Genetic and morphological identification of dragonfly larvae in three high-elevation tropical Andean lakes was carried out using DNA barcoding of the cytochrome oxidase 1 gene (COI). Phylogenetic analysis allowed the evolutionary relationships to be inferred for at least 5 species (from 74 samples) belonging to two different families within the order Odonata.

1. Study area

Three high-elevation Andean lakes, Yahuarcocha, San Pablo and Caricocha, were selected as sampling areas. These lakes are located in the northern part of Ecuador. Lake Yahuarcocha has a surface area of 2.61 km² at an elevation of 2200 meters above sea level (m.a.s.l.) (00°22´300 N, 78°06´100 W). San Pablo has more than double that area, 6.4 km², at an elevation of 2660 m.a.s.l. (0°13'0" N, 78°12'0" W). Finally, Caricocha (belonging to the Mojanda lake system) is the largest lake, with an area of 13.29 km² at 3713 m.a.s.l. (00°08´389 N, 78°15´397 W); it has the lowest human influence and lies more than 1000 meters higher than the other two lakes (Cabrera, 2015; Casallas & Gunkel, 2002; Van Colen et al., 2017).

2. Specimen sampling and identification

A total of 74 macroinvertebrate larvae were collected near the shores of the three lakes between July and September 2016. At each sampling site (see Supplementary file Table 1 for the exact coordinates of each sampling point), samples were collected using a professional hand net with a wooden handle (2 mm mesh, Efe & GB nets) for macroinvertebrates (Alba et al., 2005). All samples were preserved in the field in 95% ethanol, kept in a cooler with cooling blocks, and stored at the Environmental Research Laboratory (LABINAM) of the North Technical University, Ecuador.

For sample processing and taxonomic identification, all individuals were counted and identified to the lowest practical taxonomic level using a LEICA M165C stereomicroscope and available species identification keys (Heckman, 2006).

3. DNA extraction, amplification and sequencing

Total genomic DNA was isolated from the head and legs of each sample using the PureLink™ Genomic DNA kit, following the protocol supplied by the manufacturer. The primers COI-F: 5’-ATAATTGGRGGRTTYGGRAAYTG-3’ (forward) and COI-R: 5’-CCAAARAATCAAAATAARTGTTG-3’ (reverse) were used to amplify a 450 bp amplicon of the COI region (Hayashi, Dobata, & Futahashi, 2005; Yong et al., 2014). DNA extract concentrations were measured in a Colibri Microvolume Spectrometer and ranged on average from 5 to 10 ng/µL per sample. The DNA region was amplified using GoTaq® Green Master Mix, 2X (Promega Corporation, 2012). The 10 µL PCR reaction mix was composed of 3.7 µL of ultrapure water, 5 µL of GoTaq Green, 0.4 µL of each primer (4 µM) and 0.5 µL of DNA template. The PCR amplification protocol was: 95°C for 5 min; 35 cycles of denaturation at 95°C for 1 min, primer annealing at 50°C for 1 min and extension at 72°C for 1 min 30 s; and a final extension at 72°C for 5 min. The PCR products were visualized on 1.7% agarose gels to check that the expected fragments had been amplified, and were purified using the Wizard® SV Gel and PCR Clean-Up System kit (Promega Corporation, 2010). The company Macrogen in South Korea carried out DNA sequencing of the amplified fragments.
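The primer degeneracy and the reaction-mix arithmetic described above can be checked with a short Python sketch. It is illustrative only and not part of the dataset: the primer strings are the two sequences from the text written without internal spaces, the function names are hypothetical, and R and Y are read as the standard IUPAC codes (R = A or G, Y = C or T).

```python
from itertools import product

# IUPAC codes occurring in the two primers (R = purine, Y = pyrimidine).
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "R": "AG", "Y": "CT"}

COI_F = "ATAATTGGRGGRTTYGGRAAYTG"  # forward primer, 5'->3'
COI_R = "CCAAARAATCAAAATAARTGTTG"  # reverse primer, 5'->3'


def degeneracy(primer):
    """Number of distinct oligonucleotides encoded by a degenerate primer."""
    n = 1
    for base in primer:
        n *= len(IUPAC[base])
    return n


def expand(primer):
    """List every concrete sequence encoded by a degenerate primer."""
    return ["".join(p) for p in product(*(IUPAC[b] for b in primer))]


# Reaction mix as described in the text (volumes in microlitres).
mix_ul = {
    "ultrapure water": 3.7,
    "GoTaq Green Master Mix 2X": 5.0,
    "forward primer (4 uM stock)": 0.4,
    "reverse primer (4 uM stock)": 0.4,
    "DNA template": 0.5,
}
total_ul = sum(mix_ul.values())

# Final concentration of each primer by simple dilution:
# C_final = C_stock * V_primer / V_total
primer_final_um = 4.0 * 0.4 / total_ul

if __name__ == "__main__":
    print("COI-F variants:", degeneracy(COI_F))
    print("COI-R variants:", degeneracy(COI_R))
    print("total volume (uL):", round(total_ul, 2))
    print("final primer concentration (uM):", round(primer_final_um, 3))
```

Under these assumptions the listed volumes add up to the stated 10 µL, and each primer ends up at 0.16 µM in the reaction.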

4. Data analysis

A combination of different strategies was applied to search for the most similar sequences in two public databases: GenBank (NCBI, National Center for Biotechnology Information) and BOLD (Barcode of Life Data Systems). Identification was based on the most similar matches (> 99% similarity) (Sonet et al., 2013). The most similar sequences are called the "best match" (Meier et al., 2006). In GenBank, we used the Basic Local Alignment Search Tool (BLAST) (McClean, 2004). In BOLD, the in-built Identification System (IDS) was applied (Ratnasingham & Hebert, 2007).

The COI sequences obtained were aligned using Geneious Prime® 2020.2.2 (Kearse et al., 2012) and the MAFFT version 7 online server (https://mafft.cbrc.jp/alignment/server/index.html).

Phylogenetic reconstruction was performed with the Markov Chain Monte Carlo (MCMC) Bayesian approach implemented in BEAST version 2.6.3. The phylogenetic analysis was carried out allowing the BEAST package to detect the best substitution model for the alignment. In parallel, the analysis was run with the GTR+Invariable+Gamma model, the best substitution model suggested by the software jModelTest for DNA sequence alignments (Darriba, Taboada, Doallo, & Posada, 2012), which agreed with the best substitution model detected by BEAST. A non-parametric Bayesian Skyline (piecewise-constant) coalescent model was used with a strict molecular clock. The MCMC chain was run for 10 million generations, sampled every 1000 generations, with a random starting tree. The output of the MCMC analysis was summarized using the TreeAnnotator software included in the BEAST package. The Maximum Clade Credibility tree was produced after discarding 10% of the samples as burn-in. The final tree was visualized using FigTree (Rambaut, 2012; versions 1.4.2 and 1.4.4 are both cited).
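The "best match" threshold and the MCMC settings above can be sketched in Python as follows. This is a simplified illustration, not the scoring used by BLAST or the BOLD Identification System: identity is computed column by column on sequences assumed to be already aligned, all function and variable names are hypothetical, and the sample count ignores the initial state of the chain.

```python
def percent_identity(a, b):
    """Percent identity over aligned columns where neither sequence has a gap."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    pairs = [(x, y) for x, y in zip(a.upper(), b.upper())
             if x != "-" and y != "-"]
    if not pairs:
        return 0.0
    matches = sum(x == y for x, y in pairs)
    return 100.0 * matches / len(pairs)


def best_match(query, references, threshold=99.0):
    """Return (name, identity) of the most similar reference sequence,
    or None when its identity does not exceed the threshold."""
    scored = [(name, percent_identity(query, seq))
              for name, seq in references.items()]
    if not scored:
        return None
    name, identity = max(scored, key=lambda item: item[1])
    return (name, identity) if identity > threshold else None


def mcmc_samples(chain_length, sample_every, burnin_percent):
    """Return (logged samples, samples kept after burn-in).
    Assumption: the initial state of the chain is not counted."""
    logged = chain_length // sample_every
    discarded = logged * burnin_percent // 100
    return logged, logged - discarded


if __name__ == "__main__":
    # Toy 200 bp sequences, invented for illustration only.
    query = "ACGT" * 50
    references = {
        "ref_a": "T" + query[1:],     # one mismatch
        "ref_b": "TTTT" + query[4:],  # three mismatches
    }
    print(best_match(query, references))
    print(mcmc_samples(10_000_000, 1000, 10))
```

With the settings given in the text (10 million generations, sampling every 1000, 10% burn-in), this bookkeeping gives 10,000 logged samples, of which 9,000 remain for the summary tree.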
