Vitamin D status is heritable and under environment-dependent selection in the wild

Cite this dataset

Sparks, Alexandra et al. (2021). Vitamin D status is heritable and under environment-dependent selection in the wild [Dataset]. Dryad.


Vitamin D has a well-established role in skeletal health and is increasingly linked to chronic disease and mortality in humans and companion animals. Despite the clear significance of vitamin D for health and obvious implications for fitness under natural conditions, no longitudinal study has tested whether the circulating concentration of vitamin D is under natural selection in the wild. Here, we show that concentrations of dietary-derived vitamin D and endogenously-produced vitamin D metabolites are heritable and largely polygenic in a wild population of Soay sheep (Ovis aries). Vitamin D status was positively associated with female adult survival, and it predicted female fecundity in particular good-environment years, when sheep density and competition for resources were low. Our study provides evidence that vitamin D status has the potential to respond to selection, as well as new insights into how vitamin D metabolism is associated with fitness in the wild.


The full methods describing how the data were collected and processed are included in the main manuscript associated with this dataset.

Scripts for analysis are available at

Further information on all data can be found in the main manuscript and in the file "README_VitD_MS_Phenotype_Dataframe_Descriptions.xlsx".

Usage notes


The attached files contain data derived from the long-term field project monitoring individual Soay sheep on St Kilda and their environment. If you plan to use the data in a novel analysis, there are a number of reasons why it would be very helpful if you could contact Josephine Pemberton before doing so:

1) If you are interested in analysing the detailed project data in any depth, you may find it helpful to have our full relational database rather than the file(s) available here. If so, we have a simple process for bringing you onto the project as a collaborator.

2) Occasionally we discover and correct errors in the data. Some sheep identifiers have been recoded over time and should therefore not be linked with data archived from other papers using the Soay sheep data.

3) The data are complex, and workers who do not know the study system may benefit from advice when interpreting them.

4) At any one time, quite a few people within the existing project collaboration are analysing data from this project. Someone else may already be conducting the analysis you have in mind and it is desirable to prevent duplication of effort.

5) In order to maintain funding for the project(s), every few years we have to write proposals for original analyses to funding agencies. It is therefore very helpful for those running the project to know what data analyses are in progress.


These data are related to the manuscript MEC-21-0375. Scripts for analysis are available at

*** DATAFILE "VitD_MS_Phenotype_Data_2021.csv"
This is the phenotypic data used in the analysis. Missing data is denoted as NA. Headers are as follows:
ID    Individual sheep identifier
Sex    Sex of the individual where 1 is female and 2 is male
Age    Age of the individual in years at time of sampling
AgeGroup    Age grouped (0: lambs, 1: yearlings, 2: 2-6 years, 3: 7+ years)
CoatBin    Coat colour where 1 is dark and 2 is light
Year    Year of sampling/measurement
Total.25D    Plasma concentrations of total 25(OH)D in nmol/l
X25OHD2    Plasma concentrations of 25(OH)D2 in nmol/l
X25OHD3    Plasma concentrations of 25(OH)D3 in nmol/l
Weight    August body mass (kg)
MumID    Mother's individual identifier
BirthYear    Year of birth
Survival    0 = individual died over the subsequent winter following measurement OR 1 = individual survived over winter
EweFecundity    Female breeding success in the spring following sampling
MaleABS    Male breeding success in the spring following sampling
MaleABSBin    Male breeding success in the spring following sampling as a binary variable (0 - did not sire offspring, 1 - did sire offspring)
EweLambingThisSpring    Female lambing status in the spring of the same year (prior to sampling in August) where N = no lamb(s), L0 = had a lamb(s) of which none survived to 3 months, L1 = had a lamb(s) of which at least one survived to 3 months
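
The coding scheme above can be applied when reading the file. The following is a minimal sketch using only the Python standard library; it assumes the file is comma-separated with a header row matching the names listed above. The two sample rows are invented for illustration and are not taken from the dataset, and the helper names (decode_row, read_phenotypes) are hypothetical.

```python
import csv
import io

# Invented rows for illustration only; these values are NOT from the dataset.
SAMPLE = """ID,Sex,Age,AgeGroup,CoatBin,Year,Total.25D,Survival
A1,1,0,0,1,2012,45.2,1
A2,2,3,2,2,2012,NA,0
"""

# Code-to-label maps taken from the header descriptions above.
SEX = {"1": "female", "2": "male"}
AGE_GROUP = {"0": "lamb", "1": "yearling", "2": "2-6 years", "3": "7+ years"}
COAT = {"1": "dark", "2": "light"}


def na_to_none(value):
    """Missing data is denoted as NA in the file; map it to None."""
    return None if value == "NA" else value


def decode_row(raw):
    """Turn one raw CSV row into labelled, typed values."""
    row = {key: na_to_none(value) for key, value in raw.items()}
    total = row["Total.25D"]
    survival = row["Survival"]
    return {
        "ID": row["ID"],
        "Sex": SEX.get(row["Sex"]),
        "AgeGroup": AGE_GROUP.get(row["AgeGroup"]),
        "CoatBin": COAT.get(row["CoatBin"]),
        "Total.25D": None if total is None else float(total),  # nmol/l
        "Survived": None if survival is None else survival == "1",
    }


def read_phenotypes(handle):
    """Read every row from an open text handle."""
    return [decode_row(raw) for raw in csv.DictReader(handle)]


records = read_phenotypes(io.StringIO(SAMPLE))
for record in records:
    print(record)
```

For the real file, the same function could be called as read_phenotypes(open("VitD_MS_Phenotype_Data_2021.csv", newline="")), provided the delimiter assumption holds.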

*** DATAFILE "soayimp_genotype_data.RData"

This is an .RData file for loading into the software R. It contains a GenABEL object that contains all SNP genotype information corresponding to each individual.

*** DATAFILE "Vit_Reg_h2.RData"

This is an .RData file for loading into the software R. It contains the following objects necessary to conduct the analysis:

makeGRM - This is a function to create the genomic relatedness matrix from GCTA output.
ASReml.EstEffects - This is a function to extract the heritability and other random effects from ASReml-R models.
VITD - This is the dataset in "VitD_MS_Phenotype_Data_2021.csv" formatted for analysis in R
grminv - This is the genomic relatedness matrix created using GCTA.

*** DATAFILE "4_Updated_Pedigree_Feb2017.txt"

This file contains the pedigree with ID, MOTHER and FATHER. Missing data is NA.
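
As an illustration of how a three-column pedigree of this kind can be read, here is a minimal sketch in Python. It assumes the file is whitespace-delimited with a header row ID, MOTHER, FATHER; the delimiter is an assumption, not something stated above. The sample rows are invented, and the function names are hypothetical.

```python
import io

# Invented pedigree for illustration only; these are NOT real sheep identifiers.
SAMPLE_PEDIGREE = """ID MOTHER FATHER
101 NA NA
102 NA NA
201 101 102
202 101 NA
"""


def read_pedigree(handle):
    """Return {ID: (mother, father)}, with NA parents mapped to None."""
    header = handle.readline().split()
    pedigree = {}
    for line in handle:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        row = dict(zip(header, fields))
        mother = None if row["MOTHER"] == "NA" else row["MOTHER"]
        father = None if row["FATHER"] == "NA" else row["FATHER"]
        pedigree[row["ID"]] = (mother, father)
    return pedigree


def maternal_siblings(pedigree):
    """Group individuals by known mother."""
    groups = {}
    for animal, (mother, _father) in pedigree.items():
        if mother is not None:
            groups.setdefault(mother, []).append(animal)
    return groups


pedigree = read_pedigree(io.StringIO(SAMPLE_PEDIGREE))
# Individuals with both parents unknown.
founders = [a for a, (m, f) in pedigree.items() if m is None and f is None]
print(founders)
print(maternal_siblings(pedigree))
```

Because some sheep identifiers have been recoded over time, a lookup like this is only meaningful within the files archived here.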


This is a saved version of the GitHub repository 


Table S9. Full genome-wide association results for total 25(OH)D, 25(OH)D2 and 25(OH)D3 plasma concentrations in Soay sheep. A1 and A2 are the reference and alternate alleles at each SNP. effB is the slope of the effect of allele A2, with standard error se_effB. Chi2.1df and P1df are the association chi-squared statistic and its associated P-value, respectively, before correction with genomic control λ. Pc1df is the corrected P-value after genomic control. Exp is the corresponding P-value for that SNP locus assuming a null distribution of P-values (see Figure 2). Q.2 is the minor allele frequency.
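
To make the relationship between Chi2.1df, P1df and Pc1df concrete, the following Python sketch applies the textbook form of genomic control: λ is estimated as the median observed chi-squared statistic divided by the median of a 1-df chi-squared distribution, and each statistic is divided by λ before the P-value is recomputed. This is a sketch of the general technique under that assumption; the software used for the analysis may estimate λ differently. The input values are invented, and the function names are hypothetical.

```python
import math
from statistics import median

# Median of the chi-squared distribution with 1 degree of freedom.
CHI2_1DF_MEDIAN = 0.4549364


def p_from_chi2_1df(chi2):
    """Upper-tail P-value of a 1-df chi-squared statistic."""
    return math.erfc(math.sqrt(chi2 / 2.0))


def genomic_control(chi2_values):
    """Return (lambda, corrected P-values) for 1-df association statistics."""
    lam = median(chi2_values) / CHI2_1DF_MEDIAN
    corrected = [p_from_chi2_1df(x / lam) for x in chi2_values]
    return lam, corrected


# Invented statistics for illustration only; NOT taken from Table S9.
chi2_values = [0.1, 2 * CHI2_1DF_MEDIAN, 5.0]
uncorrected = [p_from_chi2_1df(x) for x in chi2_values]
lam, corrected = genomic_control(chi2_values)

print("lambda:", lam)
for before, after in zip(uncorrected, corrected):
    print(before, after)
```

When λ is greater than 1, every corrected P-value is larger than its uncorrected counterpart, which is the intended effect of deflating inflated test statistics.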


Wellcome Trust, Award: 098493/Z/12/Z

Royal Society, Award: UF150448

Biotechnology and Biological Sciences Research Council, Award: BB/H021868/1

Natural Environment Research Council

European Research Council, Award: EC 250098 WEG