DNA barcoding is currently unreliable for species identification in Crayfish
Data files
Dec 07, 2023 version files (124.47 KB)
- Allison_etal_DRYAD.zip
- README.md
Abstract
DNA barcoding is commonly used for species identification. Despite this, there has not been a comprehensive assessment of the utility of DNA barcoding in crayfishes (Decapoda: Astacidea). Here we examined the extent to which local barcoding gaps (used for species identification) and global barcoding gaps (used for species discovery) exist among crayfishes, and whether global gaps, if present, met a previously suggested 10× threshold.
Using publicly available mitochondrial COI sequence data from the National Center for Biotechnology Information’s nucleotide database, we created two versions of the COI datasets used for downstream analyses: one focused on the number of unique haplotypes (NH) per species, and another focused on the total number of sequences (NS; i.e., including redundant haplotypes) per species. Ultimately, a total of 81 species were included, with 58 species from five genera in family Cambaridae and 23 species from three genera in family Parastacidae.
We found that local barcoding gaps were present in only 30 species (20 members of Cambaridae and 10 Parastacidae). Global barcoding gaps were detected in only four genera (Cambarus, Cherax, Euastacus, and Tenuibranchiurus), and they were all well below the previously suggested 10× threshold. We propose that a ~5× threshold could act as a more appropriate working hypothesis for species discovery. While the NH and NS datasets yielded largely similar results, there were some discrepant inferences.
Currently, the utility of DNA barcoding for species identification and discovery in crayfish is quite limited, and caution should be exercised when molecular approaches are used in place of taxonomic expertise.
Assessment of the evidence for local and global barcoding gaps is important for understanding the reliability of molecular species identification and discovery, but outcomes are dependent on the current state of taxonomy. As this improves (e.g., via resolving species complexes, possibly elevating some subspecies to the species-level status, and redressing specimen misidentifications in natural history and other collections), so too will the utility of DNA barcoding.
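The local and global gap concepts described above can be illustrated with a minimal sketch. It assumes uncorrected p-distances and one common reading of each concept: a local gap exists when a species' smallest distance to any other species exceeds its largest within-species distance, and the global comparison is the ratio of mean interspecific to mean intraspecific distance. The distance model and exact criteria used by Allison et al. are not stated in this section, and all function names here are hypothetical.

```python
from itertools import combinations


def p_distance(a, b):
    """Proportion of differing sites, skipping sites with a gap, N, or ? in either sequence."""
    compared = differing = 0
    for x, y in zip(a.upper(), b.upper()):
        if x in "-N?" or y in "-N?":
            continue
        compared += 1
        if x != y:
            differing += 1
    return differing / compared if compared else 0.0


def local_barcoding_gaps(records):
    """records: list of (species, aligned_sequence) tuples from one genus.

    Returns {species: bool}. True means the smallest distance to any other
    species exceeds the largest within-species distance (assumed definition).
    """
    max_intra, min_inter = {}, {}
    for (sp1, s1), (sp2, s2) in combinations(records, 2):
        d = p_distance(s1, s2)
        if sp1 == sp2:
            max_intra[sp1] = max(max_intra.get(sp1, 0.0), d)
        else:
            for sp in (sp1, sp2):
                min_inter[sp] = min(min_inter.get(sp, float("inf")), d)
    species = {sp for sp, _ in records}
    return {
        sp: min_inter.get(sp, float("inf")) > max_intra.get(sp, 0.0)
        for sp in species
    }


def divergence_ratio(records):
    """Mean interspecific / mean intraspecific distance, or None if undefined."""
    intra, inter = [], []
    for (sp1, s1), (sp2, s2) in combinations(records, 2):
        (intra if sp1 == sp2 else inter).append(p_distance(s1, s2))
    if not intra or not inter or sum(intra) == 0:
        return None
    return (sum(inter) / len(inter)) / (sum(intra) / len(intra))
```

Under these assumptions, a ratio from `divergence_ratio` could be compared against the 10× or ~5× thresholds discussed above. Note that a species represented by a single sequence has a within-species maximum of zero in this sketch, which is a simplification.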
README: Mitochondrial DNA (COI gene) sequences downloaded from NCBI's nucleotide database, aligned, trimmed, and analyzed by Allison et al.
One folder contains sequence alignments for crayfish genera that were used for analyses focused on the number of unique haplotypes (i.e., the NH Datasets), and the other folder contains sequence alignments for crayfish genera that were used for analyses focused on the total number of sequences (i.e., the NS Datasets). All sequence alignments are in FASTA (.fas) file format.
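The relationship between the two dataset types can be sketched as follows: collapsing identical sequences within a group leaves one entry per unique haplotype. This is only an illustration under the assumption that haplotypes are defined by exact string identity of aligned sequences; the procedure Allison et al. actually used (for example, how missing data were handled) is not described here, and `collapse_haplotypes` is a hypothetical helper.

```python
def collapse_haplotypes(sequences):
    """Group sequence names by identical aligned sequence (case-insensitive).

    sequences: dict mapping a sequence name to its aligned sequence.
    Returns a dict mapping each unique haplotype to the list of names sharing it.
    """
    haplotypes = {}
    for name, seq in sequences.items():
        haplotypes.setdefault(seq.upper(), []).append(name)
    return haplotypes
```

Because the NH analyses count unique haplotypes per species, a helper like this would be applied to each species separately rather than to a whole genus at once.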
Folder: NH_Dataset
This contains COI sequence alignments for each of seven genera. Haplotype names are arbitrary.
-Cambarus_NH.fas: This alignment is 654-bp long and contains 266 unique haplotypes for members of the genus Cambarus.
-Cherax_NH.fas: This alignment is 594-bp long and contains 208 unique haplotypes for the genus Cherax.
-Creaserinus_NH.fas: This alignment is 579-bp long and contains 105 unique haplotypes for the genus Creaserinus.
-Euastacus_NH.fas: This alignment is 654-bp long and contains 103 unique haplotypes for the genus Euastacus.
-Faxonius_NH.fas: This alignment is 626-bp long and contains 245 unique haplotypes for the genus Faxonius.
-Lacunicambarus_NH.fas: This alignment is 585-bp long and contains 99 unique haplotypes for the genus Lacunicambarus.
-Procambarus_NH.fas: This alignment is 630-bp long and contains 173 unique haplotypes for the genus Procambarus.
Folder: NS_Dataset
This contains COI sequence alignments for each of eight genera. Sequence names are NCBI accession numbers.
-Cambarus_NS.fas: This alignment is 654-bp long and contains 418 sequences for members of the genus Cambarus.
-Cherax_NS.fas: This alignment is 594-bp long and contains 365 sequences for members of the genus Cherax.
-Creaserinus_NS.fas: This alignment is 579-bp long and contains 130 sequences for members of the genus Creaserinus.
-Euastacus_NS.fas: This alignment is 654-bp long and contains 168 sequences for members of the genus Euastacus.
-Faxonius_NS.fas: This alignment is 626-bp long and contains 831 sequences for members of the genus Faxonius.
-Lacunicambarus_NS.fas: This alignment is 520-bp long and contains 131 sequences for members of the genus Lacunicambarus.
-Procambarus_NS.fas: This alignment is 630-bp long and contains 689 sequences for members of the genus Procambarus.
-Tenuibranchiurus_NS.fas: This alignment is 644-bp long and contains 87 sequences for members of the genus Tenuibranchiurus.
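The alignment lengths and sequence counts listed above can be checked after download with a short script. The sketch below uses only the Python standard library; the function names are hypothetical, and the relative path in the usage note assumes the archive unpacks into the folder names given in this README.

```python
def read_fasta(path):
    """Read a FASTA file into a list of (name, sequence) tuples."""
    records = []
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue
            if line.startswith(">"):
                # Start a new record; the name is everything after ">".
                records.append([line[1:].strip(), []])
            elif records:
                # Sequence lines may be wrapped, so collect the pieces.
                records[-1][1].append(line)
    return [(name, "".join(parts)) for name, parts in records]


def summarize_alignment(path):
    """Return (number of sequences, sorted list of distinct sequence lengths)."""
    records = read_fasta(path)
    lengths = sorted({len(seq) for _, seq in records})
    return len(records), lengths
```

If the description above is accurate, `summarize_alignment("NH_Dataset/Cambarus_NH.fas")` would be expected to return `(266, [654])`; more than one value in the length list would indicate that the sequences are not all the same aligned length.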
Methods
We searched the NCBI nucleotide database for crayfish COI sequences using the following search terms and Boolean operators: “genus name” AND “cytochrome” OR “COI” OR “COX1”. All searches were conducted, and sequences downloaded, between March and December 2022. We aligned sequences from each genus separately using MUSCLE (Edgar, 2004), implemented in MEGA v.7.026 (Kumar, Stecher, & Tamura, 2016).
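For reference, the per-genus search string described above can be assembled programmatically. This is a minimal sketch: `build_query` is a hypothetical helper, straight quotes replace the typographic ones, and the term order simply mirrors the text without adding any grouping parentheses, since none are given.

```python
def build_query(genus):
    """Build the search string for one genus, mirroring the term order in the text."""
    terms = ["cytochrome", "COI", "COX1"]
    return f'"{genus}" AND ' + " OR ".join(f'"{term}"' for term in terms)


# The eight genera listed in the NS_Dataset folder description.
GENERA = [
    "Cambarus", "Cherax", "Creaserinus", "Euastacus",
    "Faxonius", "Lacunicambarus", "Procambarus", "Tenuibranchiurus",
]

queries = {genus: build_query(genus) for genus in GENERA}
```

Each resulting string could then be pasted into the NCBI nucleotide search box or passed to a retrieval tool of choice.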