SNP dataset for 27 populations of Adenocaulon himalaicum (Asteraceae)
Data files
Jul 03, 2025 version files 194.55 MB
- neutral_SNPs_of_A._himalaicum.vcf (194.55 MB)
- README.md (1.52 KB)
Abstract
Understanding adaptive evolution and survival risks in understory herbs is crucial for the effective conservation of biodiversity. However, how environmental gradients mediate local adaptation in understory herbs, and how their populations respond to a changing environment, is not well understood. In this study, we conducted a population genomic analysis of Adenocaulon himalaicum (Asteraceae) in East Asia, a good model of a dominant understory herb, to unveil the mechanisms of adaptive evolution in this species. Based on 34,398 single nucleotide polymorphisms (SNPs) across 27 populations, we identified three genetic lineages accompanied by high levels of genetic differentiation between populations. Our isolation by environment (IBE) results indicated a significant effect of environmental gradients on the genomic variation of A. himalaicum (r = 0.18, p = 0.03). Redundancy analysis (RDA) revealed that the genomic variation was significantly explained by mean diurnal range (bio2), precipitation of driest quarter (bio17), precipitation of warmest quarter (bio18), and soil type (TAXNWRB). Using two genotype-environment association (GEA) methods, we identified 27 SNPs as candidates for core climate-related adaptation loci, two of which were further validated by qRT-PCR experiments. Furthermore, predictions of the spatiotemporal pattern of genomic offsets under different future climate scenarios indicated that populations near the southeastern edge of the Himalaya, the Yunnan-Guizhou Plateau, the northern Korean Peninsula, and southern Japan are the most vulnerable. These populations therefore warrant conservation priority, as their candidate variants and unique genetic compositions may be useful for future breeding.
https://doi.org/10.5061/dryad.j6q573nq6
Description of the data and file structure
A VCF (Variant Call Format) file is a text file storing genetic variation data, like SNPs or insertions/deletions.
Files and variables
File: neutral_SNPs_of_A._himalaicum.vcf
The dataset consists of a single VCF file containing genotypes for 221 individuals from 27 Adenocaulon himalaicum populations. The file contains meta-information lines, followed by a header line and then data lines, one per variant site. Each data line begins with eight fixed fields: chromosome (CHROM), position (POS), identifier (ID), reference base(s) (REF), alternate base(s) (ALT), quality (QUAL), filter status (FILTER), and additional information (INFO). Genotypes are encoded as allele values separated by ‘/’ or ‘|’ to indicate unphased or phased data, respectively.
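As an illustration of the layout described above, the following is a minimal Python sketch that splits one VCF data line into its eight fixed fields and the per-sample genotypes. The sample names and the example line are hypothetical and are not taken from the dataset. The sketch also assumes the standard VCF layout, in which a FORMAT column and one column per individual follow the fixed fields.

```python
# The eight fixed fields that start every VCF data line.
FIXED_FIELDS = ["CHROM", "POS", "ID", "REF", "ALT", "QUAL", "FILTER", "INFO"]


def parse_vcf_line(line, sample_names):
    """Split one tab-separated VCF data line into fixed fields and genotypes."""
    cols = line.rstrip("\n").split("\t")
    record = dict(zip(FIXED_FIELDS, cols[:8]))
    record["POS"] = int(record["POS"])

    # Column 9 (FORMAT) names the colon-separated keys of each sample column.
    gt_index = cols[8].split(":").index("GT")

    genotypes = {}
    for name, col in zip(sample_names, cols[9:]):
        gt = col.split(":")[gt_index]
        phased = "|" in gt  # '|' marks phased data, '/' unphased
        alleles = gt.replace("|", "/").split("/")
        genotypes[name] = {"alleles": alleles, "phased": phased}
    record["genotypes"] = genotypes
    return record


# Hypothetical header and data line, for illustration only.
header = "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tpop1_ind1\tpop1_ind2"
sample_names = header.split("\t")[9:]
line = "1\t1042\t12:7:+\tA\tG\t.\tPASS\tNS=2\tGT:DP\t0/1:14\t./.:0"

rec = parse_vcf_line(line, sample_names)
print(rec["CHROM"], rec["POS"], rec["genotypes"]["pop1_ind1"])
```

An allele value of `.` denotes a missing call, so `./.` in the second sample column means that individual has no genotype at this site.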
Loci with minor allele frequencies <0.05 were filtered out, and the maximum observed heterozygosity was set at 0.5 to reduce potential paralogs. Only polymorphic loci present in at least 85% of the individuals were retained.
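The three filtering criteria can be sketched in Python as follows. This is not the authors' pipeline: the exact Stacks options used are not stated here, the function names are hypothetical, biallelic loci are assumed, and treating each threshold as inclusive is also an assumption.

```python
def locus_stats(genotypes):
    """Compute call rate, minor allele frequency, and observed heterozygosity.

    genotypes: one allele list per individual, e.g. ["0", "1"];
    a "." allele marks a missing call.
    """
    called = [g for g in genotypes if "." not in g]
    if not called:
        return {"call_rate": 0.0, "maf": 0.0, "het_obs": 0.0}

    counts = {}
    for g in called:
        for allele in g:
            counts[allele] = counts.get(allele, 0) + 1
    total = sum(counts.values())

    # Assumes a biallelic locus: the rarer allele is the minor allele.
    # A monomorphic locus has no minor allele, so its MAF is 0.
    maf = min(counts.values()) / total if len(counts) > 1 else 0.0
    het_obs = sum(1 for g in called if len(set(g)) > 1) / len(called)
    return {
        "call_rate": len(called) / len(genotypes),
        "maf": maf,
        "het_obs": het_obs,
    }


def keep_locus(genotypes, min_maf=0.05, max_het=0.5, min_call_rate=0.85):
    """Apply the three thresholds described in the text to one locus."""
    s = locus_stats(genotypes)
    return (
        s["maf"] >= min_maf
        and s["het_obs"] <= max_het
        and s["call_rate"] >= min_call_rate
    )


# Hypothetical loci with 10 individuals each.
ok = [["0", "0"]] * 6 + [["0", "1"]] * 3 + [["1", "1"]]
too_het = [["0", "1"]] * 8 + [["0", "0"]] * 2            # possible paralog
too_sparse = [["0", "0"]] * 5 + [["0", "1"]] * 3 + [[".", "."]] * 2

print(keep_locus(ok), keep_locus(too_het), keep_locus(too_sparse))
```

In the first hypothetical locus, the minor allele appears 5 times among 20 called alleles (frequency 0.25) and 3 of 10 individuals are heterozygous, so it passes all three thresholds. The second fails the heterozygosity limit and the third fails the 85% presence requirement.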
This file contains the 34,398 SNPs identified by Stacks v2.41 for 221 individuals of Adenocaulon himalaicum.
Code/Software
Both files can be opened with a standard text editor, such as Notepad. The VCF file can also be opened in TASSEL.