Detecting genetic signals of selection in heavily bottlenecked reindeer populations by comparing parallel founder events


De Jong, Menno; Lovatt, Fiona; Hoelzel, Rus (2021), Detecting genetic signals of selection in heavily bottlenecked reindeer populations by comparing parallel founder events, Dryad, Dataset,


Founder populations are of special interest to both evolutionary and conservation biologists, but the detection of genetic signals of selection in these populations is challenging due to their demographic history. Geographically separated founder populations likely subjected to similar selection pressures provide an ideal but rare opportunity to overcome these challenges. Here we take advantage of such a situation generated when small, isolated founder populations of reindeer were established on the island of South Georgia, and using this system we look for empirical evidence of selection overcoming strong genetic drift. We generated a 70K ddRADseq SNP database for the two parallel reindeer founder populations and screened for signatures of soft sweeps. We find evidence for a genomic region under selection shared between the two populations, and support our findings with Wright-Fisher model simulations to assess the power and specificity of interpopulation selection scans (i.e. Bayescan, OutFLANK, PCadapt and a newly developed scan called Genome Wide Differentiation Scan (GWDS)) in the context of pairwise source-founder comparisons. Our simulations indicate that loci under selection in small founder populations are most likely detected by GWDS, and strengthen the hypothesis that the outlier region represents a true locus under selection. We explore possible, relevant functional roles for genes in linkage with the detected outlier locus.


The SNP datasets contain genotype information for samples taken from the two introduced populations of reindeer (Rangifer tarandus) that occurred until recently on the South Atlantic island of South Georgia. The dataset also includes samples from their Norwegian source population. Sequencing libraries were generated with the ddRADseq protocol, and SNPs were subsequently called with the STACKS2.2 refmap pipeline, using the reindeer genome as reference. The data contains multiple SNPs per read pair.

The data is stored in PED and MAP format (plink format). Each population is represented by ~30 individuals, as specified in the popfile. 'Busen' and 'Barff' are the names of the two peninsulas on which the two South Georgia reindeer populations occurred. Also included in the download are read depth information files generated with vcftools.
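For readers who want to inspect the genotypes outside plink or R, the PED and MAP layout described above can be read with a few lines of Python. The sketch below is illustrative only and is not part of the deposited commands: the function names and the toy records are hypothetical, and it assumes the standard plink text layout (four MAP columns; six leading PED columns followed by two alleles per SNP, with '0' marking a missing allele).

```python
from collections import Counter


def parse_map(lines):
    """Return a list of (chromosome, snp_id, position) tuples from MAP lines."""
    snps = []
    for line in lines:
        fields = line.split()
        if not fields:
            continue
        # MAP columns: chromosome, SNP id, genetic distance, base-pair position
        chrom, snp_id, _genetic_distance, position = fields[:4]
        snps.append((chrom, snp_id, int(position)))
    return snps


def parse_ped(lines, n_snps):
    """Return {individual_id: [(allele1, allele2), ...]} from PED lines."""
    genotypes = {}
    for line in lines:
        fields = line.split()
        if not fields:
            continue
        # The first six PED columns are family id, individual id, father,
        # mother, sex and phenotype; allele pairs follow, two per SNP.
        alleles = fields[6:]
        if len(alleles) != 2 * n_snps:
            raise ValueError(f"unexpected allele count for {fields[1]}")
        genotypes[fields[1]] = [
            (alleles[i], alleles[i + 1]) for i in range(0, len(alleles), 2)
        ]
    return genotypes


def allele_counts(genotypes, snp_index, missing="0"):
    """Count non-missing alleles at one SNP across all individuals."""
    counts = Counter()
    for calls in genotypes.values():
        for allele in calls[snp_index]:
            if allele != missing:
                counts[allele] += 1
    return counts


# Toy records (hypothetical, not taken from the dataset)
map_lines = ["1 snp_a 0 1000", "1 snp_b 0 2500"]
ped_lines = [
    "fam1 ind1 0 0 0 -9 A A G T",
    "fam1 ind2 0 0 0 -9 A C 0 0",
]

snps = parse_map(map_lines)
genotypes = parse_ped(ped_lines, len(snps))
print(allele_counts(genotypes, 0))
```

With the real files, the same functions could be applied to the lines of the downloaded PED and MAP files, and the popfile could then be used to group individuals by population before counting alleles.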

Usage Notes

The 'reindeer_commands.txt' file contains all the commands used to analyse the SNP dataset, to generate the main plots in the associated research article, and to perform simulations to test the performance of selection scans and the detectability of SNPs under selection. The commands make use of functions implemented in the R package SambaR, which can be downloaded from GitHub (


The British Deer Society

The Kenneth Whitehead Trust
