Linking genomic offset statistics to the shape of selection gradients
Data files
Oct 13, 2025 version (files: 688.07 MB)
- data_millet.zip (9.90 MB)
- data_poplar.zip (412.67 KB)
- data_redspruce.zip (90.38 MB)
- GO-selection-gradient_S3.Rmd (29.32 KB)
- GO-selection-gradient_S4.Rmd (29.32 KB)
- README.md (8.70 KB)
- scripts_slim.zip (12.21 KB)
- sims_slim.zip (587.29 MB)
Abstract
Genomic offset metrics are increasingly used to predict population maladaptation under changing climates, based on the assumption of a negative statistical relationship between offset measures and local relative fitness. Recent theoretical advances have confirmed this relationship by relating genomic offset to phenotypic trait distances along selection gradients. However, these metrics typically rely on the assumption that stabilizing selection, which maintains local adaptive optima, operates on fitness-related traits through Gaussian-shaped selection gradients. In this study, we extend the theory to accommodate more diverse forms of selection gradients and introduce more general genomic offset measures that preserve the fitness-offset relationship. We validate this generalization through simulations and demonstrate the utility of these new measures in predicting relative fitness in common garden experiments involving three plant species: pearl millet, a vital staple cereal grown in arid soils, and two emblematic North American tree species, balsam poplar and red spruce. Our findings indicate that assuming a local Gaussian-shaped selection gradient for climate adaptation is a robust approximation for these species. These results have important implications for validating genomic offset predictions using fitness proxies and for studies that aim to predict fitness loss based on genomic offset metrics.
This README.txt file was updated on May 12th 2025 by Thibaut Capblancq
Paper associated with this archive
This repository contains the data necessary to reproduce the results of the article "Linking genomic offset statistics to the shape of selection gradients", written by Thibaut Capblancq, Aurélien Tauzin, Yves Vigouroux, Philippe Cubry and Olivier François and published in 2025 in the journal The American Naturalist for a special feature on Genomic Forecasting.
Originators
Thibaut CAPBLANCQ (1), Aurélien TAUZIN (1), Yves VIGOUROUX (2), Philippe CUBRY (2) and Olivier FRANÇOIS (1)
(1) TIMC, Centre National de la Recherche Scientifique, Université Grenoble-Alpes, Grenoble INP, Grenoble, France
(2) DIADE, Université de Montpellier, Institut de Recherche pour le Développement, Montpellier, France
Contact information
Thibaut CAPBLANCQ: thibaut.capblancq@univ-grenoble-alpes.fr
Olivier FRANÇOIS: olivier.francois@univ-grenoble-alpes.fr
Funding Sources
This project was supported by the French National Research Agency (ANR-22-CE45-0033 and ANR-22-CE32-0008)
ACCESS INFORMATION
Licenses/restrictions placed on the data or code
CC0 1.0 Universal (CC0 1.0) Public Domain Dedication
Data derived from other sources
Empirical data were retrieved from the following sources:
- Gain et al. (2023), https://doi.org/10.1093/molbev/msad140
- Rhoné et al. (2020), https://doi.org/10.1038/s41467-020-19066-4
- Fitzpatrick et al. (2021), https://doi.org/10.1111/1755-0998.13374
- Capblancq et al. (2023), https://doi.org/10.1111/nph.18465
Recommended citation for this data/code archive
Data from: Linking genomic offset statistics to the shape of selection gradients. DOI: 10.5061/dryad.jdfn2z3m6
DATA & CODE FILE OVERVIEW
This data repository consists of 1015 data files, 6 code scripts, and this README document. The data files, code files and variables are described below.
Data files and variables
data_poplar.zip is a zipped folder containing genomic, climatic and phenotypic data for 41 sampled localities of balsam poplar:
- data_poplar_accessions.csv: A list of names for the 41 accessions
- data_poplar_climate.csv: A table of five climate predictors (bio2: Mean Temperature Diurnal Range in °C, bio10: Mean Temperature of Warmest Quarter in °C, bio11: Mean Temperature of Coldest Quarter in °C, bio18: Precipitation of Warmest Quarter in mm and bio19: Precipitation of Coldest Quarter in mm) for the source locality of each accession.
- data_poplar_climate.cg.csv: A table of the same five climate predictors (bio2: Mean Temperature Diurnal Range in °C, bio10: Mean Temperature of Warmest Quarter in °C, bio11: Mean Temperature of Coldest Quarter in °C, bio18: Precipitation of Warmest Quarter in mm and bio19: Precipitation of Coldest Quarter in mm) for a common garden site in Vermont (USA).
- data_poplar_trait.csv: Height increment data for each accession in centimetres, measured in a common garden in Vermont (USA).
- data_poplar_genotype.csv: genotypes for 9,100 loci.
These data have been extracted from Fitzpatrick et al. (2021), https://doi.org/10.1111/1755-0998.13374.
data_redspruce.zip is a zipped folder containing genomic, climatic and phenotypic data for 63 sampled localities of red spruce:
- data_redspruce_accessions.csv: A list of names for the 63 accessions
- data_redspruce_climate.csv: A table of eleven climate predictors (DD_0: degree-days below 0°C, DD18: degree-days above 18°C, MAR: mean annual solar radiation (MJ m⁻² d⁻¹), PAS: precipitation as snow (mm) between August of the previous year and July of the current year, MSP: May to September precipitation (mm), RH: mean annual relative humidity (%), EXT: extreme maximum temperature over 30 years, CMD: Hargreaves climatic moisture deficit (mm), TD: temperature difference between the mean warmest month temperature (MWMT) and the mean coldest month temperature (MCMT), or continentality (°C), eFFP: the day of the year on which the frost-free period (FFP) ends and PET: potential evapotranspiration) for the source locality of each accession.
- data_redspruce_climate.cg.csv: A table of the same eleven climate predictors (DD_0: degree-days below 0°C, DD18: degree-days above 18°C, MAR: mean annual solar radiation (MJ m⁻² d⁻¹), PAS: precipitation as snow (mm) between August of the previous year and July of the current year, MSP: May to September precipitation (mm), RH: mean annual relative humidity (%), EXT: extreme maximum temperature over 30 years, CMD: Hargreaves climatic moisture deficit (mm), TD: temperature difference between the mean warmest month temperature (MWMT) and the mean coldest month temperature (MCMT), or continentality (°C), eFFP: the day of the year on which the frost-free period (FFP) ends and PET: potential evapotranspiration) for a common garden site in Burlington (USA).
- data_redspruce_trait.csv: Height increment data for each accession in centimetres, measured in a common garden in Burlington (USA).
- data_redspruce_genotype.csv: genotypes for 176,716 loci.
These data have been extracted from Capblancq et al. (2023), https://doi.org/10.1111/nph.18465.
data_millet.zip is a zipped folder containing genomic, climatic and phenotypic data for 154 sampled localities of pearl millet:
- data_millet_accessions.csv: A list of names for the 154 accessions
- data_millet_climate.csv: A table of five climate predictors (PREC1: first axis of a PCA with only precipitation variables, TEMP1: first axis of a PCA with only temperature variables, TEMP2: second axis of a PCA with only temperature variables, TEMP8: eighth axis of a PCA with only temperature variables and TEMP14: fourteenth axis of a PCA with only temperature variables; see Gain et al. (2023), https://doi.org/10.1093/molbev/msad140, for details) for the source locality of each accession.
- data_millet_climate.cg.csv: A table of the same five climate predictors (PREC1: first axis of a PCA with only precipitation variables, TEMP1: first axis of a PCA with only temperature variables, TEMP2: second axis of a PCA with only temperature variables, TEMP8: eighth axis of a PCA with only temperature variables and TEMP14: fourteenth axis of a PCA with only temperature variables; see Gain et al. (2023), https://doi.org/10.1093/molbev/msad140, for details) for a common garden site in Sadoré (Niger).
- data_millet_trait.csv: Total seed weight data for each accession in grams, measured in a common garden in Sadoré (Niger).
- data_millet_genotype.csv: genotypes for 16,632 loci.
These data have been extracted from Rhoné et al. (2020), https://doi.org/10.1038/s41467-020-19066-4.
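Before running the analyses, it can be useful to confirm that each downloaded empirical archive contains the five files listed above. The following Python sketch does this with the standard library only. The function name `missing_files` is hypothetical, and the internal folder layout of the archives is an assumption: the check compares base names only, so it works whether or not the files sit inside a subfolder.

```python
import os
import zipfile
from pathlib import PurePosixPath

# File name endings as listed in this README for each empirical dataset.
EXPECTED_SUFFIXES = (
    "accessions.csv",
    "climate.csv",
    "climate.cg.csv",
    "trait.csv",
    "genotype.csv",
)


def missing_files(zip_path, species):
    """Return the expected file names for `species` that are absent from the archive.

    Only base names are compared, because the folder structure inside
    the archives is not documented here (assumption).
    """
    with zipfile.ZipFile(zip_path) as archive:
        present = {PurePosixPath(name).name for name in archive.namelist()}
    expected = [f"data_{species}_{suffix}" for suffix in EXPECTED_SUFFIXES]
    return [name for name in expected if name not in present]


# Check whichever of the three archives are present in the working directory.
for species in ("poplar", "redspruce", "millet"):
    path = f"data_{species}.zip"
    if os.path.exists(path):
        print(species, "missing:", missing_files(path, species))
```

An empty list means that all five expected files were found in the archive.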
sims_slim.zip is a zipped folder containing 100 simulations used to confirm the validity of the alpha-GO statistic and its relationship with the geometric genomic offset. 10 files are associated with each simulation, including:
- environmental values for the two causal variables var1 and var2 before (var1_step1 and var2_step1) and after (var1_pred1) an abrupt environmental change,
- the position of each sampled individual on the simulated grid (position_ind_step1)
- individuals' fitness before (fitness_step1) and after (fitness_pred1) the environmental change,
- individuals' trait value for the two simulated traits (trait1 and trait2)
- individuals' genotypes (genome_step1.vcf)
- positions of the 120 causal mutations associated with QTL1 (mutationm2_step1.txt) and QTL2 (mutationm3_step1.txt)
Code scripts and workflow
scripts_slim.zip is a zipped folder containing the SLiM (version 3.7) scripts used to produce the simulations for four different alpha values.
R markdown script file describing all the analyses conducted on the simulated datasets
R markdown script file describing all the analyses conducted on the empirical datasets
SOFTWARE VERSIONS
The only software required to reproduce our results is the software R (version 4.3.1) and the R-packages LEA (version 3.19), reshape2 (version 1.4.4), ggplot2 (version 3.5.1) and ggpubr (version 0.6.0). If one wants to reproduce the simulations, the software SLiM (version 3.7) will be needed as well.
The α-GO statistics and their relationship with the geometric GO were first tested using simulations. Spatially explicit individual-based simulations were performed using SLiM 3.7 (Haller and Messer 2019), as described for the highly polygenic scenario in Gain et al. (2023). Briefly, each individual genome contained both neutral and adaptive mutations, the latter being under local stabilizing selection in a 2D environment (12 × 12 grid) with two orthogonal environmental gradients, (x1, x2). Two traits, (z1, z2), controlled by 120 adaptive mutations with additive effects, were matched to each causal environmental variable by local stabilizing selection. The probability that an individual survived to the next generation was calculated as the product of density regulation and fitness. The density of individuals was regulated by spatial competition depending on the number of individuals within a circle of radius S = 0.8 (Haller and Messer 2019). Individual fitness was calculated according to the following formula:
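$$\omega(z_1, z_2 \mid x) = \exp\left(-\frac{1}{2}\left[\left(\frac{|z_1 - x_1|}{\sigma_K}\right)^{\alpha} + \left(\frac{|z_2 - x_2|}{\sigma_K}\right)^{\alpha}\right]\right)$$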
where x1 and x2 are the local environmental values corresponding to the optimal trait values in that local environment, z1 and z2 are the individual trait values computed from the 120 adaptive mutations, and σK is a selection coefficient. The parameter α controls the shape of the stabilizing selection gradient, with α = 0.5 giving a strict exponential shape, α = 1 a Laplacian shape, α = 2 a Gaussian shape and α = 3 a more tolerant shape (Figure 1). For each scenario, 100 replicates were run with different random number generator seeds for 2,000 generations before an instantaneous environmental change. At the end of each simulation, individual geographic coordinates, environmental variables and individual fitness values before and after the instantaneous environmental change were recorded.
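To make the role of the shape parameter concrete, the fitness formula above can be sketched in Python as follows. This is only an illustration and not the SLiM code used for the simulations: the function name `fitness` and the numerical values of the traits, optima and σK are hypothetical and were chosen for the example.

```python
import math


def fitness(z, x, sigma_k, alpha):
    """Fitness of trait values `z` given local optima `x`.

    Implements exp(-1/2 * sum_i (|z_i - x_i| / sigma_k) ** alpha),
    i.e. the formula above, for any number of traits.
    """
    if sigma_k <= 0 or alpha <= 0:
        raise ValueError("sigma_k and alpha must be positive")
    total = sum((abs(zi - xi) / sigma_k) ** alpha for zi, xi in zip(z, x))
    return math.exp(-0.5 * total)


# Hypothetical values for illustration only (not taken from the simulations).
z = (0.3, -0.2)   # individual trait values (z1, z2)
x = (0.0, 0.0)    # local environmental optima (x1, x2)
sigma_k = 0.5

# The four shapes considered in the simulations.
for alpha in (0.5, 1, 2, 3):
    print(f"alpha = {alpha}: fitness = {fitness(z, x, sigma_k, alpha):.4f}")
```

In this sketch, when both trait distances are smaller than σK, the fitness increases with α, in line with the description of α = 3 as a more tolerant shape. When both distances equal σK, the fitness is exp(-1) for every α.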
Gain, Clément, Bénédicte Rhoné, Philippe Cubry, Israfel Salazar, Florence Forbes, Yves Vigouroux, Flora Jay, and Olivier François. 2023. “A Quantitative Theory for Genomic Offset Statistics.” Molecular Biology and Evolution 40 (6): msad140. https://doi.org/10.1093/molbev/msad140.
Haller, Benjamin C., and Philipp W. Messer. 2019. “SLiM 3: Forward Genetic Simulations Beyond the Wright–Fisher Model.” Molecular Biology and Evolution 36 (3): 632–37. https://doi.org/10.1093/molbev/msy228.
