Data from: GT-seq panel development for species identification and parentage analysis of closely related hybridizing Scaphirhynchus sturgeons
Data files
Apr 09, 2026 version; files total 5.86 GB
AdultSturgeonHiSeq1_MD1.nb.html (3.35 MB)
rawRead_baselineSample.tar.gz (5.86 GB)
README.md (2.21 KB)
Abstract
Hatchery supplementation is vital for conserving dwindling fish populations. Effective augmentation requires distinguishing hatchery-origin from wild individuals and accurately identifying species, particularly in systems where closely related species coexist. Genetic monitoring is key to quantifying genetic differences, but conventional markers struggle to identify hybrids, especially backcrosses. Misidentifying hybrids in hatchery programs compromises wild gene pools because hatchery broodstock contribute numerous offspring that are released into the wild. Here, we present a workflow for developing and evaluating a Genotyping-in-Thousands by sequencing (GT-seq) single-nucleotide polymorphism (SNP) panel for North American river sturgeons (Scaphirhynchus spp.). This panel is designed to detect complex hybrid classes and to determine parent-offspring relationships. Our species identification panel (S-loci) contains 155 SNPs selected for high genetic differentiation (FST) between Pallid Sturgeon and Shovelnose Sturgeon, and the parentage assignment panel (P-loci) includes 112 SNPs with high heterozygosity within Pallid Sturgeon. Simulation analyses demonstrated that our GT-seq S-loci panel reliably classifies pure species, F1, F2, and backcross hybrids, even with up to 70% missing data. The P-loci panel achieves high-confidence parentage assignment with ≥80% typed loci, with performance influenced by the proportion of sampled parents. Overall, the novel Scaphirhynchus GT-seq panel developed in this study represents a robust and efficient tool for detecting hybridization, assigning parentage, and providing critical information for management decisions in ongoing Pallid Sturgeon conservation.
Dataset DOI: 10.5061/dryad.m0cfxppj6
Description of the data and file structure
This dataset includes all data and documentation necessary to replicate the analyses presented in this study. Raw sequencing reads from baseline samples were used to evaluate the performance of the developed GT-seq SNP marker panel for species identification and parentage analysis. An HTML document, generated from a Markdown file, describes the development of the initial SNP markers using ddRAD-seq data (https://doi.org/10.5061/dryad.mcvdnck1x). Metadata, including baseline sample genotypes, sample information, CERVUS parameter settings, and genetic variation metrics, are provided in the supplementary files.
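As a rough illustration of how the missing-data figures quoted in the abstract could be applied to one sample's genotypes, the following Python sketch computes the proportion of typed loci for each panel. It treats "up to 70% missing data" (S-loci) and "≥80% typed loci" (P-loci) as simple cutoffs, which is an assumption made for illustration and not necessarily the authors' quality-control rule. The function names, the locus names, and the use of `None` for a missing call are all hypothetical; the actual layout of the supplementary genotype table is not described here.

```python
from fractions import Fraction

# Illustrative cutoffs taken from the figures quoted in the abstract.
# Treating them as hard filters is an assumption of this sketch.
MAX_MISSING_SPECIES = Fraction(7, 10)   # S-loci: up to 70% missing data
MIN_TYPED_PARENTAGE = Fraction(4, 5)    # P-loci: at least 80% typed loci


def typed_proportion(genotypes):
    """Return the exact fraction of loci that have a non-missing call.

    `genotypes` maps a locus name to a genotype string, or to None when
    the locus was not typed (a hypothetical encoding).
    """
    if not genotypes:
        return Fraction(0)
    typed = sum(1 for call in genotypes.values() if call is not None)
    return Fraction(typed, len(genotypes))


def panel_usability(s_loci_calls, p_loci_calls):
    """Summarize whether one sample meets the illustrative cutoffs."""
    s_typed = typed_proportion(s_loci_calls)
    p_typed = typed_proportion(p_loci_calls)
    return {
        "s_typed": s_typed,
        "p_typed": p_typed,
        "species_id_ok": (1 - s_typed) <= MAX_MISSING_SPECIES,
        "parentage_ok": p_typed >= MIN_TYPED_PARENTAGE,
    }
```

Exact fractions are used instead of floats so that a sample sitting precisely on a cutoff (for example, 4 of 5 P-loci typed) is classified consistently.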
Files and variables
File: AdultSturgeonHiSeq1_MD1.nb.html
Description: Documentation detailing the development of the initial SNP markers using ddRAD-seq data.
File: supplementaryFigure.docx
Description: Supplementary figures associated with the manuscript.
File: supplementaryTable.xlsx
Description: Supplementary tables from the manuscript, including genotype data, sample metadata, analysis parameter settings, and genetic variation metrics.
File: rawRead_baselineSample.tar.gz
Description: Raw sequencing reads from baseline samples, used to evaluate the performance of the newly developed GT-seq marker panel for species identification and parentage analysis. These FASTQ files were generated by amplicon sequencing, specifically GT-seq. The files are named by sample ID (Sample: unique ID for each fish, e.g., LMB-011), corresponding to the “TableS2” sheet in the supplementary table. Baseline genotype data from 428 Pallid Sturgeon and Shovelnose Sturgeon samples were used to optimize and validate the Scaphirhynchus GT-seq genetic markers.
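Because the FASTQ files are named by sample ID, the archive can be indexed without extracting all 5.86 GB. The sketch below streams through the tarball with Python's standard `tarfile` module and groups FASTQ members by the sample ID derived from each file name. It assumes that members end in `.fastq` or `.fq` (optionally followed by `.gz`) and that the rest of the base name is the sample ID; the actual naming inside the archive may include extra suffixes such as read-pair tags, in which case the pattern would need adjusting. The function names are hypothetical.

```python
import re
import tarfile
from collections import defaultdict

# Assumption: FASTQ members end in .fastq or .fq, optionally gzipped.
FASTQ_PATTERN = re.compile(r"\.(fastq|fq)(\.gz)?$", re.IGNORECASE)


def sample_id_from_name(member_name):
    """Strip the directory part and FASTQ extension from a member name."""
    base = member_name.rsplit("/", 1)[-1]
    return FASTQ_PATTERN.sub("", base)


def index_fastq_members(archive_path):
    """Map each derived sample ID to the FASTQ member names in the archive.

    The archive is read sequentially; nothing is extracted to disk.
    """
    index = defaultdict(list)
    # "r:*" lets tarfile detect the compression (gzip here) automatically.
    with tarfile.open(archive_path, "r:*") as archive:
        for member in archive:
            if member.isfile() and FASTQ_PATTERN.search(member.name):
                index[sample_id_from_name(member.name)].append(member.name)
    return dict(index)
```

The resulting sample IDs could then be compared against the Sample column of the “TableS2” sheet to confirm that every baseline fish has reads in the archive.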
Access information
Other publicly accessible locations of the data (ddRAD-seq data):
