Defining Relictual Biodiversity: Conservation Units in Speckled Dace (Leuciscidae: Rhinichthys osculus) of the Greater Death Valley Ecosystem
Mussmann, Steven; Douglas, Marlis; Oakey, David; Douglas, Michael (2021), Defining Relictual Biodiversity: Conservation Units in Speckled Dace (Leuciscidae: Rhinichthys osculus) of the Greater Death Valley Ecosystem, Dryad, Dataset, https://doi.org/10.5061/dryad.51c59zw62
The tips in the tree of life serve as foci for conservation and management, yet clear delimitations are masked by inherent variance at the species-population interface. Analyses using thousands of nuclear loci can potentially sort inconsistencies, yet standard categories applied to this parsing are themselves potentially conflicting and/or subjective [e.g., DPS (distinct population segments); DUs (Diagnosable Units-Canada); MUs (management units); SSP (subspecies); ESUs (Evolutionarily Significant Units); UIEUs (uniquely identified evolutionary units)]. One potential solution for consistent categorization is to create a comparative framework by accumulating statistical results from independent studies and evaluating congruence among data sets. Our study illustrates this approach in speckled dace (Leuciscidae: Rhinichthys osculus) endemic to two basins (Owens and Amargosa) in the Death Valley ecosystem. These fish persist in the Mojave Desert as isolated Plio-Pleistocene relicts and are of conservation concern, but lack formal taxonomic descriptions/designations. Double-digest RAD (ddRAD) methods identified 14,355 SNP loci across 10 populations (N=140). Species delimitation analyses [multispecies coalescent (MSC) and unsupervised machine learning (UML)] delineated four putative ESUs. FST outlier loci (N=106) were juxtaposed to uncover the potential for localized adaptations. We detected one hybrid population that resulted from upstream reconnection of habitat following contemporary pluvial periods, whereas remaining populations represent relics of ancient tectonism within geographically isolated springs and groundwater-fed streams. Our study offers three salient conclusions: a blueprint for a multi-faceted delimitation of conservation units; a proposed mechanism by which criteria for intraspecific biodiversity can be potentially standardized; and a strong argument for the proactive management of critically endangered Death Valley ecosystem fishes.
Whole genomic DNA was extracted using several methods: Gentra Puregene DNA Purification Tissue kit; QIAGEN DNeasy Blood and Tissue Kit; QIAamp Fast DNA Tissue Kit; and CsCl-gradient. Extracted DNA was visualized on 2.0% agarose gels and quantified with a Qubit 2.0 fluorometer (Thermo Fisher Scientific, Inc.). Library preparation followed a double-digest Restriction-Site Associated DNA (ddRAD) protocol (Peterson, Weber, Kay, Fisher, & Hoekstra, 2012). Barcoded samples (100 ng DNA each) were pooled in sets of 48 following Illumina adapter ligation, then size-selected at 375-425 bp (Chafin, Martin, Mussmann, Douglas, & Douglas, 2018) using the Pippin Prep System (Sage Science). Size-selected DNA was subjected to 12 cycles of PCR amplification using Phusion high-fidelity DNA polymerase (New England Biolabs) following manufacturer protocols. Successful library amplification was then confirmed via Agilent 2200 TapeStation and qPCR. Final libraries were pooled in sets of three per lane and subjected to 100 bp single-end sequencing (Illumina HiSeq 2000, University of Wisconsin Biotechnology Center; and HiSeq 4000, University of Oregon Genomics & Cell Characterization Core Facility).
Libraries were de-multiplexed and filtered for quality using process_radtags (Stacks v1.48; Catchen, Hohenlohe, Bassham, Amores, & Cresko, 2013). All reads with uncalled bases or Phred quality scores < 10 were discarded. Reads with ambiguous barcodes that otherwise passed quality filtering were recovered when possible (i.e., barcodes with one mismatched nucleotide). A clustering threshold of 0.85 (Eaton, 2014) was used for de novo assembly of ddRAD loci in pyRAD v3.0.66. Reads with > 4 low-quality bases (Phred quality score < 20) were removed. A minimum of 15 reads was required to call a locus for an individual. Putative paralogs were removed using standard methods for their identification in ddRAD data (Eaton, 2014; McKinney, Waples, Seeb, & Seeb, 2017): loci with heterozygosity > 0.6 and those containing > 10 heterozygous sites were discarded. The resulting data were filtered (BCFtools; Li, 2011) to retain a single biallelic SNP per locus that was present in at least 33% of individuals (hereafter referred to as ‘SNP-all’). The 33% cutoff was designed to minimize potential bias in missing data for ingroup samples, given the unbalanced basin sampling (e.g., Owens N=50 versus Amargosa N=80). The aim was to prevent the more numerous Amargosa samples from dictating which SNPs were recovered during alignment, genotyping, and filtering (Eaton, Spriggs, Park, & Donoghue, 2017; Huang & Knowles, 2016).
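The final filtering step (one biallelic SNP per locus, genotyped in at least 33% of individuals) can be illustrated with a minimal Python sketch. This is not the BCFtools command used for the dataset; it only restates the filtering logic on a toy in-memory structure. The Snp class, the locus names, and the rule of keeping the first passing SNP within each locus are assumptions made for illustration, since the text does not say which SNP was chosen per locus.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple

# A diploid genotype as a pair of allele indices; None marks a missing call.
Genotype = Optional[Tuple[int, int]]


@dataclass
class Snp:
    locus: str                 # RAD locus identifier (hypothetical naming)
    pos: int                   # position of the SNP within the locus
    genotypes: List[Genotype]  # one entry per individual


def call_rate(snp: Snp) -> float:
    """Fraction of individuals with a non-missing genotype."""
    if not snp.genotypes:
        return 0.0
    called = sum(g is not None for g in snp.genotypes)
    return called / len(snp.genotypes)


def is_biallelic(snp: Snp) -> bool:
    """True when exactly two distinct alleles are observed among called genotypes."""
    alleles = {a for g in snp.genotypes if g is not None for a in g}
    return len(alleles) == 2


def select_snps(snps: List[Snp], min_call_rate: float = 0.33) -> List[Snp]:
    """Keep one SNP per locus: the first (by position) that is biallelic and
    meets the call-rate threshold. The 'first passing' rule is an assumption."""
    kept: Dict[str, Snp] = {}
    for snp in sorted(snps, key=lambda s: (s.locus, s.pos)):
        if snp.locus in kept:
            continue  # this locus already contributed a SNP
        if is_biallelic(snp) and call_rate(snp) >= min_call_rate:
            kept[snp.locus] = snp
    return list(kept.values())


if __name__ == "__main__":
    toy = [
        Snp("locA", 5, [(0, 0), (0, 1), None, None]),    # passes: biallelic, 50% called
        Snp("locA", 9, [(0, 1), (1, 1), (0, 0), None]),  # skipped: locA already kept
        Snp("locB", 3, [(0, 0), (0, 0), None, None]),    # fails: monomorphic
        Snp("locC", 7, [(0, 1), None, None, None]),      # fails: 25% called
    ]
    print([(s.locus, s.pos) for s in select_snps(toy)])
```

Keeping a single SNP per locus limits linkage among markers, while the call-rate threshold controls how much missing data each retained SNP carries.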
Raw reads are available through the NCBI Sequence Read Archive under BioProject ID PRJNA598959.
The files listed below are the input files for the various programs used in this manuscript. All are included in spd_dv_input_files.tar.gz, along with a readme.txt file that provides the same information.
dv_split_03a.xml = input for BFD (Bayes Factor Delimitation)
dv_split_04.xml = input for BFD
dv_split_05a.xml = input for BFD
dv_split_05b.xml = input for BFD
dv_split_06a.xml = input for BFD
dv_split_06b.xml = input for BFD
dv_split_06c.xml = input for BFD
dv_split_07a.xml = input for BFD
dv_split_07b.xml = input for BFD
dv_split_07c.xml = input for BFD
dv_split_08.xml = input for BFD
dv_split_amargosa_v_owens.xml = input for BFD
dv_split_lahontan_v_dv.xml = input for BFD
spd_dv.filt2.min4.no_RUP.phy.nex = input for SVDquartets (PAUP)
spd_dv.filt2.min4.phy = input for HyDe
spd_dv.filt2.min4.phy.no_atratulus.str.genepop = input for LOSITAN
spd_dv.filt2.no_RUP.vcf.cf = input for PoMo (IQ-TREE)
spd_dv.filt2.recode.strct_in.str.arp = input for Arlequin after spd_dv.filt2.vcf was filtered by ADMIXPIPE
spd_dv.filt2.vcf = VCF file output by BCFtools after filtering; also served as input for ADMIXPIPE
spd_dv.filt2.vcf.select.recode.strct_in.str.arp = file of SNPs under selection input for Arlequin after being filtered by ADMIXPIPE
spd_dv.filt2.vcf.select.vcf = SNPs under selection; input for ADMIXPIPE
spd_dv.vcf = unfiltered VCF file output by pyRAD
subsampled.no_atratulus.phy.str = input for machine learning algorithms
unlinked.txt.newhybrids = input for NewHybrids
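For readers who want to check a downloaded copy of the archive against the list above, the following Python sketch lists the members of a .tar.gz file and separates the BFD inputs (the dv_split_*.xml files) from everything else. The function name group_inputs and the grouping labels are hypothetical, and the internal directory layout of spd_dv_input_files.tar.gz is not described here, so the sketch matches on base file names only.

```python
import tarfile
from collections import defaultdict
from typing import Dict, List


def group_inputs(archive_path: str) -> Dict[str, List[str]]:
    """Group archive members by a coarse label derived from the file name.

    Labels are illustrative: 'BFD' for dv_split_*.xml, 'readme' for readme
    files, and 'other' for every remaining input file.
    """
    groups: Dict[str, List[str]] = defaultdict(list)
    with tarfile.open(archive_path, "r:gz") as tar:
        for member in tar.getmembers():
            if not member.isfile():
                continue  # ignore directories and links
            name = member.name.rsplit("/", 1)[-1]  # base name only
            if name.startswith("dv_split_") and name.endswith(".xml"):
                groups["BFD"].append(name)
            elif name.lower().startswith("readme"):
                groups["readme"].append(name)
            else:
                groups["other"].append(name)
    return {label: sorted(names) for label, names in groups.items()}


if __name__ == "__main__":
    # Assumes the archive has been downloaded to the working directory.
    for label, names in group_inputs("spd_dv_input_files.tar.gz").items():
        print(label, len(names))
```

Based on the list above, the BFD group would be expected to contain 13 XML files if the archive matches this description.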