Scutellaria floridana ddRAD-Seq vcf files
Data files
Dec 16, 2025 version, files total 207.70 MB
- populations.haps.vcf (10.65 MB)
- populations.snps.vcf (197.04 MB)
- README.md (1.88 KB)
Abstract
The threatened mint Florida skullcap (Scutellaria floridana) is endemic to four counties in the Florida panhandle. Because development and habitat modification extirpated several historical occurrences, only 19 remain to date. To inform conservation management and delisting decisions, a comprehensive investigation of the genetic diversity and relatedness, population structure, and clonal diversity was conducted using SNP data generated by ddRAD. Compared with other Lamiaceae, we detected low genetic diversity (HE = 0.125 to 0.145), low to moderate evidence of inbreeding (FIS = -0.02 to 0.555), and moderate divergence (FST = 0.05 to 0.15). We identified eight populations with most of the genetic diversity, which should be protected in situ, and four populations with low genetic diversity and high clonality. Clonal reproduction in our circular plots and in 92% of the sites examined was substantial, with average clonal richness of 0.07 and 0.59, respectively. Scutellaria floridana appears to have experienced a continued decline in the number of extant populations since its listing under the Endangered Species Act; still, the combination of sexual and asexual reproduction may be advantageous for maintaining the viability of extant populations. However, the species will likely require ongoing monitoring, management, and increased public awareness to ensure its survival and effectively conserve its genetic diversity.
These are haplotype and SNP vcf files derived from ddRAD-seq of 284 individuals across 17 populations of Scutellaria floridana, the Florida skullcap, in the Lamiaceae. The final dataset that we used for downstream analysis included 10,223 loci and 28,210 variable sites that were each present in at least 60% of individuals.
Description of the data and file structure
populations.haps.vcf: Individual genotypes by haplotype, organized by chromosome number, which here identifies the ddRAD-Seq sequenced fragment. Individuals are labeled by population and individual number. For example, Sf31.2 is Population 31, individual 2. Some populations were sampled at multiple sites. In this case, the site number is also included. For example, Sf9.1.1 is Population 9, site 1, individual 1.
populations.snps.vcf: Individual genotypes by SNP. Individuals are labeled by population and individual number. For example, Sf31.2 is Population 31, individual 2. Some populations had multiple sites within a population. In this case, the site number is also included. For example, Sf9.1.1 is Population 9, site 1, individual 1.
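The labeling scheme described above can be decoded programmatically. The following is a minimal Python sketch, not part of the dataset: the regular expression is inferred only from the two example labels (Sf31.2 and Sf9.1.1), and the names `SampleLabel`, `parse_label`, and `sample_names` are hypothetical. It relies on the standard VCF layout, in which sample names follow the nine fixed columns of the `#CHROM` header line.

```python
import re
from typing import List, NamedTuple, Optional


class SampleLabel(NamedTuple):
    population: int
    site: Optional[int]  # None when the population has no site level
    individual: int


# Assumption: every label looks like Sf<pop>.<ind> or Sf<pop>.<site>.<ind>,
# as in the two examples given in the README text.
_LABEL = re.compile(r"^Sf(\d+)(?:\.(\d+))?\.(\d+)$")


def parse_label(label: str) -> SampleLabel:
    """Split a sample label into population, optional site, and individual."""
    match = _LABEL.match(label)
    if match is None:
        raise ValueError(f"unrecognized sample label: {label!r}")
    pop, site, ind = match.groups()
    return SampleLabel(int(pop), int(site) if site is not None else None, int(ind))


def sample_names(header_line: str) -> List[str]:
    """Return the sample names from a VCF '#CHROM' header line."""
    fields = header_line.rstrip("\n").split("\t")
    return fields[9:]  # the first nine columns are fixed by the VCF format
```

Labels that do not match the assumed pattern raise an error rather than being guessed at, so any naming variants in the real files would surface immediately.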
Sharing/Access information
n/a
Code/Software
To assess within-population genetic diversity, we calculated heterozygosity and inbreeding coefficients for each population using the R package hierfstat. To assess genetic differentiation between populations, we calculated pairwise FST using the R package StAMPP. To investigate isolation by distance, we used a Mantel test, implemented in the R package ade4, to test for a significant relationship between pairwise FST and geographic distance between populations. We estimated ancestry coefficients for individuals via an sNMF analysis using the R package LEA and performed a discriminant analysis of principal components (DAPC) using the R package hierfstat.
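For readers unfamiliar with the diversity statistics named above, the following Python sketch shows the textbook definitions for a single biallelic locus in one population: observed heterozygosity (Ho), expected heterozygosity He = 2p(1 - p), and FIS = 1 - Ho/He. It is an illustration only and not the computation performed by the R packages used in the study, which may apply sample-size corrections and average over loci. The function name `locus_stats` is hypothetical.

```python
from typing import Iterable, Optional, Tuple


def locus_stats(genotypes: Iterable[str]) -> Optional[Tuple[float, float, float]]:
    """Return (Ho, He, Fis) for one biallelic locus, or None if nothing is called.

    Genotypes are VCF-style GT strings ("0/0", "0/1", "1/1"); any genotype
    containing "." is treated as missing and skipped.
    """
    n_called = 0
    n_het = 0
    n_alt = 0
    for gt in genotypes:
        alleles = gt.replace("|", "/").split("/")
        if len(alleles) != 2 or "." in alleles:
            continue  # missing or non-diploid call
        n_called += 1
        n_het += alleles[0] != alleles[1]
        n_alt += alleles.count("1")
    if n_called == 0:
        return None
    p = n_alt / (2 * n_called)       # alternate-allele frequency
    ho = n_het / n_called            # observed heterozygosity
    he = 2 * p * (1 - p)             # expected under Hardy-Weinberg, uncorrected
    fis = 1 - ho / he if he > 0 else float("nan")
    return ho, he, fis
```

A negative FIS, such as the lower bound reported in the abstract, corresponds to more heterozygotes than expected (Ho greater than He) under these definitions.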
We extracted total genomic DNA from our plant material samples. Briefly, frozen samples were finely ground in liquid N2 and dissolved in an extraction buffer containing 100 mM Tris (pH 8.0), 50 mM EDTA, 500 mM NaCl, and 0.1% (w/v) PVP-40, followed by precipitation of cellular debris with 5 M potassium acetate and precipitation of genomic DNA with isopropanol. We assessed DNA quality by gel electrophoresis on a 1.5% agarose gel in Tris-Acetate-EDTA buffer to ensure there was little to no DNA degradation. We estimated the quantity of DNA in our samples using a Qubit 4 fluorometer (Thermo Fisher Scientific, Waltham, MA).
Samples that displayed adequate quality and reached a minimum DNA concentration of 20 ng/µL were then sent to Floragenex (Floragenex, Inc., 4640 SW Macadam Ave, Portland, OR), where double-digest restriction-site-associated DNA sequencing (ddRAD-Seq) was carried out. To summarize, DNA was first digested using the restriction endonucleases PstI and MseI. Samples were diluted for PCR amplification and the product was used to construct a ddRAD-Seq library. The library was sequenced at the University of Oregon Genomics and Cell Characterization Core Facility (GC3F) on a NovaSeq 6000 with an SP100 chip, generating 118 bp single-end reads with a mean 27.5x effective coverage per sample. The sequence data were run through the STACKS pipeline (version 2.60) to assemble the short-read sequences from all the samples (via the process_radtags program) and to align reads into loci that are genotyped (via the gstacks program). Single nucleotide polymorphism data were exported in VCF version 4.2 file format for downstream data analysis. Three quality cut-off filters were applied, retaining genotypes present in at least 40%, 60%, or 80% of individuals. We used the dataset in which each locus was represented in at least 60% of individuals; the stricter dataset with less missing data (loci found in at least 80% of individuals) resulted in a loss of informative loci.
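The missing-data cut-off described above can be checked against the exported VCF files. The Python sketch below computes, for one VCF data line, the fraction of individuals with a called genotype and compares it with a threshold such as 0.6. It is an assumption-laden illustration: it works per VCF record rather than per locus, it is not the filter as implemented in STACKS, and the names `call_rate` and `keep_record` are hypothetical.

```python
from typing import List


def call_rate(sample_fields: List[str]) -> float:
    """Fraction of samples with a called genotype in one VCF record."""
    if not sample_fields:
        return 0.0
    called = 0
    for field in sample_fields:
        gt = field.split(":", 1)[0]  # GT is the first FORMAT subfield
        if "." not in gt:            # "./." or "." marks a missing call
            called += 1
    return called / len(sample_fields)


def keep_record(vcf_line: str, min_rate: float = 0.6) -> bool:
    """True if at least min_rate of the individuals are genotyped."""
    fields = vcf_line.rstrip("\n").split("\t")
    return call_rate(fields[9:]) >= min_rate  # samples start at column 10
```

Applying `keep_record` with thresholds of 0.4, 0.6, and 0.8 to the same file would show how many records each cut-off retains, which mirrors the trade-off between missing data and informative loci described in the text.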
- Hanko, Gina Renee; Vogel, Maria Therese; Negrón-Ortiz, Vivian; Moore, Richard C. (2023). High Prevalence of Clonal Reproduction and Low Genetic Diversity in Scutellaria floridana, a Federally Threatened Florida-Endemic Mint. Plants. https://doi.org/10.3390/plants12040919
