Supplementary data for: simultaneous species detection and discovery with environmental DNA metabarcoding: a freshwater mollusk case study
Data files
Dec 21, 2023 version files 118.26 KB
- Alignment_file.fa
- partition_file.nex
- README.md
Abstract
Environmental DNA (eDNA) sampling is a powerful tool for rapidly characterizing biodiversity patterns for speciose, cryptic taxa with incomplete taxonomies. One such group, which is also of high conservation concern, is the North American freshwater gastropods. In particular, springsnails of the genus Pyrgulopsis (Family: Hydrobiidae) are prevalent throughout the western United States, where >140 species have been described. Many of the described species are narrow endemics known from a single spring or locality, and many additional species likely remain undescribed. The distribution of these species across the landscape is of interest because habitat loss and degradation, climate change, groundwater mining, and pollution have resulted in springsnail imperilment rates as high as 92%. Determining distributions with conventional sampling methods is difficult because these snails are often <5 mm in length with few distinguishing morphological characters, making them hard both to detect and to identify. To facilitate detection of Pyrgulopsis, we developed an eDNA metabarcoding protocol that is both inexpensive and capable of rapid, accurate detection of all known Pyrgulopsis species. When compared with conventional collection techniques, our pipeline consistently resulted in detection at sites previously known to contain Pyrgulopsis springsnails, at a cost per site that is likely to be substantially lower than that of the conventional sampling and individual barcoding used historically. Additionally, because our method uses eDNA extracted from filtered water, it is non-destructive and suitable for the detection of endangered species where “no take” restrictions may be in effect.
This effort represents both a tool which is immediately applicable to a group of high conservation concern across western North America and a case study in the broader application of eDNA sampling for landscape assessments of cryptic taxa of conservation concern.
README: Supplementary Data for: simultaneous species detection and discovery with environmental DNA metabarcoding: a freshwater mollusk case study
https://doi.org/10.5061/dryad.tb2rbp07q
Description of the data and file structure
The dataset includes a 137-taxon alignment of COI sequences in FASTA format (Alignment_file.fa). Sequences were obtained from GenBank; accession numbers are provided in the taxon names. Sequences were aligned using MAFFT 7.215 with default settings. The alignment was then partitioned by codon position, with each codon position modeled separately in the likelihood analysis described below. The partition_file.nex file used to define these codon positions is included.
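A minimal Python sketch of how the alignment could be sanity-checked and how a by-codon partition is typically written as NEXUS charsets. It does not reproduce the contents of partition_file.nex: the charset names (`codon_pos1` to `codon_pos3`) are hypothetical, and the sketch assumes the alignment starts at the first codon position and that every third site belongs to the same position.

```python
from typing import Dict


def read_fasta(text: str) -> Dict[str, str]:
    """Parse FASTA text into {header: sequence}, joining wrapped lines."""
    records: Dict[str, str] = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:]
            if name in records:
                raise ValueError(f"duplicate header: {name}")
            records[name] = ""
        elif name is None:
            raise ValueError("sequence data before first header")
        else:
            records[name] += line
    return records


def alignment_length(records: Dict[str, str]) -> int:
    """Return the shared sequence length, or raise if the rows differ."""
    lengths = {len(seq) for seq in records.values()}
    if len(lengths) != 1:
        raise ValueError(f"sequences differ in length: {sorted(lengths)}")
    return lengths.pop()


def codon_partitions(n_sites: int) -> str:
    """Build a NEXUS sets block with one charset per codon position.

    Assumption: site 1 is codon position 1. Charset names are illustrative.
    """
    lines = ["#nexus", "begin sets;"]
    for pos in (1, 2, 3):
        # "start-end\3" selects every third site beginning at `start`.
        lines.append(f"    charset codon_pos{pos} = {pos}-{n_sites}\\3;")
    lines.append("end;")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Toy two-taxon alignment, not data from this dataset.
    demo = ">taxon_A\nATGGCA\n>taxon_B\nATG-CA\n"
    recs = read_fasta(demo)
    n = alignment_length(recs)
    print(len(recs), "taxa,", n, "sites")
    print(codon_partitions(n), end="")
```

Applied to the real file, for example `read_fasta(open("Alignment_file.fa").read())`, the record count should equal 137 if the file matches the description above.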
Additionally, this dataset contains the maximum likelihood phylogeny estimated from this alignment using IQ-TREE 2.2.1 with the following command.
command line: iqtree2 -s Alignment_file.fa -p partition_file.nex -runs 15 -m MFP+MERGE -B 1000 -nt 2
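The flags in the command above can be spelled out with a short Python sketch that assembles the same argument list. The function name `build_iqtree_command` and its parameters are hypothetical, and the flag descriptions in the comments reflect our reading of the IQ-TREE 2 documentation rather than anything stated in this README. The sketch only prints the command; it does not run the analysis.

```python
import shlex
import shutil
from typing import List


def build_iqtree_command(alignment: str, partitions: str,
                         runs: int = 15, bootstraps: int = 1000,
                         threads: int = 2) -> List[str]:
    """Assemble the iqtree2 argument list used for this dataset."""
    return [
        "iqtree2",
        "-s", alignment,         # input sequence alignment
        "-p", partitions,        # partition file (edge-linked proportional model)
        "-runs", str(runs),      # number of independent tree searches
        "-m", "MFP+MERGE",       # ModelFinder with partition merging, then tree search
        "-B", str(bootstraps),   # ultrafast bootstrap replicates
        "-nt", str(threads),     # number of CPU threads
    ]


if __name__ == "__main__":
    cmd = build_iqtree_command("Alignment_file.fa", "partition_file.nex")
    print(shlex.join(cmd))
    if shutil.which("iqtree2") is None:
        print("iqtree2 not found on PATH; command printed only")
```

Keeping the arguments as a list rather than a single string avoids shell-quoting problems if the file paths contain spaces; the list could be passed to `subprocess.run(cmd, check=True)` to launch the analysis.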
Data was derived from the following sources:
- NCBI GenBank (Nucleic Acids Research, 2013 Jan;41(D1):D36-42)
Methods
The dataset includes a 137-taxon alignment of COI sequences in FASTA format. Sequences were obtained from GenBank; accession numbers are provided in the taxon names. Sequences were aligned using MAFFT 7.215, then partitioned by codon position. The corresponding partition file is provided as well (partition_file.nex). Additionally, this dataset contains the maximum likelihood phylogeny (Supplementary_Figure_1_SVG.svg) estimated from this alignment using IQ-TREE 2.1.3 with the following command line:
iqtree2 -s Alignment_file.fa -p partition_file.nex -runs 15 -m MFP+MERGE -B 1000 -nt 2