
Data from: Disentangling the effects of geographic peripherality and habitat suitability on neutral and adaptive genetic variation in Swiss stone pine

Cite this dataset

Dauphin, Benjamin et al. (2020). Data from: Disentangling the effects of geographic peripherality and habitat suitability on neutral and adaptive genetic variation in Swiss stone pine [Dataset]. Dryad.


It is generally accepted that the spatial distribution of neutral genetic diversity within a species’ native range mostly depends on effective population size, demographic history, and geographic position. However, it is unclear how genetic diversity at adaptive loci correlates with geographic peripherality or with habitat suitability within the ecological niche. Using exome-wide genomic data and distribution maps of the Alpine range, we first tested whether geographic peripherality correlates with four measures of population genetic diversity at >17,000 SNP loci in 24 Alpine populations (480 individuals) of Swiss stone pine (Pinus cembra) from Switzerland. To distinguish between neutral and adaptive SNP sets, we used four approaches (two gene diversity estimates, FST outlier test, and environmental association analysis) that search for signatures of selection. Second, we established ecological niche models for P. cembra in the study range and investigated how habitat suitability correlates with genetic diversity at neutral and adaptive loci. All estimates of neutral genetic diversity decreased with geographic peripherality, but were uncorrelated with habitat suitability. However, heterozygosity (He) at adaptive loci based on Tajima’s D declined significantly with increasingly suitable conditions. No other diversity estimates at adaptive loci were correlated with habitat suitability. Our findings suggest that populations at the edge of a species' geographic distribution harbour limited neutral genetic diversity due to demographic properties. Moreover, we argue that populations from suitable habitats went through strong selection processes, are thus well adapted to local conditions, and therefore exhibit reduced genetic diversity at adaptive loci compared to populations at niche margins.


We carried out DNA extraction, library preparation, and exome capture as described in Rellstab et al. (2019). Briefly, high-quality DNA of 20 trees per population was used to produce equimolar DNA pools for all 24 populations for pooled sequencing (Pool-Seq; Rellstab, Zoller, Tedder, Gugerli, & Fischer, 2013; Schlötterer, Tobler, Kofler, & Nolte, 2014), an approach that has been shown to yield accurate estimates of allele frequencies (Rellstab et al., 2019). We generated barcoded libraries (average insert size of 550 bp) using the NEBNext Ultra II DNA Library Prep Kit for Illumina (New England Biolabs, Massachusetts, USA) and subsequently performed probe hybridisation using the MYcroarray myBaits Custom Capture Kit. The 24 hybridised libraries were then sequenced on four lanes of an Illumina HiSeq 4000 (paired-end reads of 150 bp) at the Functional Genomics Center Zurich (FGCZ, Zurich, Switzerland) and Fasteris (Geneva, Switzerland; Table S4).

Following Rellstab et al. (2019), we trimmed and filtered raw reads with TRIMMOMATIC 0.35 (Bolger, Lohse, & Usadel, 2014) using a quality threshold of 20 on both forward and reverse reads. We then mapped the remaining reads back to those transcripts of the reference transcriptome that contained probe bases using BOWTIE 2.3.0 (Langmead, Trapnell, Pop, & Salzberg, 2009), and performed variant (i.e. SNP) and invariant site calling using GATK 3.8 (McKenna et al., 2010) with ploidy set to 40 (i.e. the number of chromosomes sequenced per pool of 20 diploid individuals), a coverage ≥40×, and a mapping quality/depth ratio ≥0.25. To exclude putatively paralogous genes, variant and invariant site calling was carried out only for the 4,950 single-copy contigs identified in Rellstab et al. (2019). These authors used HDplot (McKinney, Waples, Seeb, & Seeb, 2017) to exclude putatively paralogous contigs based on excess heterozygosity and deviation from the usual allele balance (read ratio). For population genetic analyses, we assembled a SNP set by applying two additional filters that remove weakly supported SNPs: we excluded SNPs with (i) a minor allele frequency (MAF) ≤2.5% across populations (i.e. one chromosome in a pool) or (ii) missing data in at least one population.
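The two final SNP filters can be illustrated with a minimal sketch. This is not the authors' pipeline: the function names, the representation of a SNP as a list of per-pool allele frequencies (with None marking missing data), and the choice to compute the across-population MAF as the unweighted mean over pools are all assumptions made for illustration. Only the pool size of 40 chromosomes and the 2.5% threshold come from the text above.

```python
from typing import Optional, Sequence

# From the text: 20 diploid trees per pool, so 40 chromosomes per pool.
PLOIDY = 40
# One chromosome in a pool corresponds to 1/40 = 2.5%.
MAF_THRESHOLD = 1 / PLOIDY


def minor_allele_frequency(freqs: Sequence[float]) -> float:
    """Fold the mean allele frequency across pools to the minor allele.

    Assumption: "across populations" is taken as the unweighted mean
    of per-pool frequencies; the source does not state how it was computed.
    """
    mean_freq = sum(freqs) / len(freqs)
    return min(mean_freq, 1.0 - mean_freq)


def keep_snp(freqs: Sequence[Optional[float]],
             maf_threshold: float = MAF_THRESHOLD) -> bool:
    """Return True if a SNP passes both filters described in the text.

    (ii) Drop SNPs with missing data (None) in at least one population.
    (i)  Drop SNPs whose MAF across populations is <= the threshold.
    """
    if any(f is None for f in freqs):
        return False
    return minor_allele_frequency(freqs) > maf_threshold


if __name__ == "__main__":
    # Hypothetical per-pool frequencies for three SNPs in three pools.
    snps = {
        "snp_a": [0.10, 0.20, 0.30],   # passes both filters
        "snp_b": [0.10, None, 0.30],   # missing data in one pool
        "snp_c": [0.00, 0.025, 0.00],  # too rare across pools
    }
    for name, freqs in snps.items():
        print(name, "kept" if keep_snp(freqs) else "excluded")
```

A SNP is retained only if it passes both checks, so the two exclusion criteria combine with a logical "or".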


Swiss National Science Foundation, Award: 31003A_152664