Data from: Genetic sex assignment in wild populations using GBS data: a statistical threshold approach
Stovall, William R. et al. (2018), Data from: Genetic sex assignment in wild populations using GBS data: a statistical threshold approach, Dryad, Dataset, https://doi.org/10.5061/dryad.84v2f
Establishing the sex of individuals in wild systems can be challenging and often requires genetic testing. Genotyping-by-sequencing (GBS) and other reduced representation DNA sequencing (RRS) protocols (e.g., RADseq, ddRAD) have enabled the analysis of genetic data on an unprecedented scale. Here, we present a novel approach for the discovery and statistical validation of sex-specific loci in GBS datasets. We used GBS to genotype 166 New Zealand fur seals (NZFS, Arctocephalus forsteri) of known sex. We retained monomorphic loci as potential sex-specific markers in the locus discovery phase. We then used (i) a sex-specific locus threshold (SSLT) to identify significantly male-specific loci within our dataset and (ii) a significant sex-assignment threshold (SSAT) to confidently assign sex in silico, based on the presence or absence of significantly male-specific loci, to individuals in our dataset treated as unknowns (98.9% accuracy for females; 95.8% for males, estimated via cross-validation). Furthermore, we assigned sex to 86 individuals of truly unknown sex using our SSAT, and assessed the effect of SSLT adjustments on these assignments. From 90 verified sex-specific loci, we developed a panel of three sex-specific PCR primers that we used to ascertain sex independently of our GBS data, and we show that these primers amplify reliably in at least three other pinniped species. Using monomorphic loci normally discarded from large SNP datasets is an effective way to identify robust sex-linked markers for non-model species. Our novel pipeline can be used to identify and statistically validate monomorphic and polymorphic sex-specific markers across a range of species and RRS datasets.
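The two-threshold idea described above can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the abstract does not state which statistic defines the SSLT or the SSAT, so the sketch assumes a one-sided Fisher exact test for the locus threshold and a simple minimum count of male-specific loci for the assignment threshold. All function names, parameters (`alpha`, `min_hits`), and the toy presence/absence data are hypothetical.

```python
from math import comb


def fisher_one_sided(males_with, n_males, females_with, n_females):
    """One-sided Fisher exact p-value: probability of seeing at least
    `males_with` males carrying the locus if presence were independent of sex.
    (Assumed stand-in for the SSLT statistic, not taken from the source.)"""
    total = n_males + n_females
    present = males_with + females_with
    upper = min(present, n_males)
    tail = sum(
        comb(n_males, k) * comb(n_females, present - k)
        for k in range(males_with, upper + 1)
    )
    return tail / comb(total, present)


def male_specific_loci(presence, sex, alpha=0.01):
    """Return loci whose presence is significantly enriched in known males.
    presence: {locus: {individual: bool}}, sex: {individual: "M" or "F"}.
    `alpha` plays the role of a hypothetical locus threshold."""
    males = [i for i, s in sex.items() if s == "M"]
    females = [i for i, s in sex.items() if s == "F"]
    selected = []
    for locus, calls in presence.items():
        m_with = sum(calls[i] for i in males)
        f_with = sum(calls[i] for i in females)
        p = fisher_one_sided(m_with, len(males), f_with, len(females))
        if p < alpha:
            selected.append(locus)
    return selected


def assign_sex(calls, loci, min_hits=1):
    """Call an individual male if at least `min_hits` male-specific loci are
    present, otherwise female. `min_hits` is a hypothetical assignment threshold."""
    hits = sum(bool(calls.get(locus, False)) for locus in loci)
    return "M" if hits >= min_hits else "F"


# Toy data (invented for illustration): 6 known males, 6 known females.
males = [f"m{i}" for i in range(6)]
females = [f"f{i}" for i in range(6)]
sex = {**{m: "M" for m in males}, **{f: "F" for f in females}}
presence = {
    # Locus A: present in every male, absent in every female.
    "A": {**{m: True for m in males}, **{f: False for f in females}},
    # Locus B: present in half of each sex, so not sex-specific.
    "B": {ind: ind in ("m0", "m1", "m2", "f0", "f1", "f2") for ind in sex},
    # Locus C: present in everyone.
    "C": {ind: True for ind in sex},
}

loci = male_specific_loci(presence, sex, alpha=0.01)
unknown_call = assign_sex({"A": True, "B": False, "C": True}, loci, min_hits=1)
```

In this toy example only locus A passes the assumed locus threshold, and an individual carrying it is called male. Tightening or loosening `alpha` changes which loci are retained, which is the kind of threshold adjustment whose effect on assignments the abstract says was assessed.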