# Data and code from: Predicting population genetic change in an autocorrelated random environment: insights from a large automated experiment

## Cite this dataset

Rescan, Marie et al. (2021). Data and code from: Predicting population genetic change in an autocorrelated random environment: insights from a large automated experiment [Dataset]. Dryad. https://doi.org/10.5061/dryad.m0cfxpp3z

## Abstract

Most natural environments exhibit a substantial component of random variation, with a degree of temporal autocorrelation that defines the color of environmental noise. Such environmental fluctuations cause random fluctuations in natural selection, affecting the predictability of evolution. But despite long-standing theoretical interest in population genetics in stochastic environments, there is a dearth of empirical estimates of the underlying parameters of this theory. More importantly, it is still an open question whether evolution in fluctuating environments can be predicted indirectly using simpler measures, which combine environmental time series with population estimates in constant environments. Here we address these questions by resorting to an automated experimental evolution approach. We used a liquid-handling robot to expose over a hundred lines of the micro-alga *Dunaliella salina* to randomly fluctuating salinity over a continuous range, with controlled mean, variance, and autocorrelation. We then tracked the frequencies of two competing strains through amplicon sequencing of nuclear and chloroplastic barcode sequences. We show that the magnitude of environmental fluctuations (variance), but also their predictability (autocorrelation), had large impacts on the average selection coefficient. The stochastic variance in frequency change, which quantifies randomness in population genetics, was substantially higher in a fluctuating environment. The reaction norm of selection coefficients against constant salinity yielded accurate predictions for the mean selection coefficient in a fluctuating environment. This selection reaction norm was in turn well predicted by environmental tolerance curves, with population growth rate against salinity. However, both the selection reaction norm and tolerance curves underestimated the variance in selection caused by random environmental fluctuations.
Overall, our results provide exceptional insights into the prospects for understanding and predicting genetic evolution in randomly fluctuating environments.

## Methods

We exposed a mixture of two *Dunaliella salina* strains (CCAP 19/12, denoted A, and CCAP 19/15, denoted C) to fluctuating vs constant salinity, and tracked their frequencies through time by amplicon sequencing of the ITS2 and a chloroplast locus. Genomic DNA from 1071 samples was extracted using the Nucleospin® plant II (Macherey-Nagel).

In order to make efficient use of the sequencing data, we reduced all chloroplast and ITS2 reads to short haplotypes made of a succession of few linked SNPs that individually maximized the *F*<sub>ST</sub> among pure, reference cultures of A and C. We estimated fluctuating selection by tracking the dynamics of the frequency *p* of strain C through time, by combining two sources of genetic information, from the ITS2 and the chloroplast locus. We considered that the frequencies measured at the ITS2 and chloroplast loci were two observations (with error) of a true, unobserved strain frequency *p*. This corresponds to a state-space model, and we wrote its explicit likelihood function in C++, and optimized it using the TMB package in R (v.3.5.2).
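The idea of treating the two loci as noisy observations of one shared frequency can be illustrated with a minimal Python sketch. This is not the authors' TMB code: it covers only the observation part of a state-space model, it assumes plain binomial sampling of reads at each locus (the actual error model is defined in the C++ files), and the function names and read counts are hypothetical.

```python
import math


def binom_loglik(k, n, p):
    """Binomial log-likelihood of k strain-C reads out of n (constant term dropped)."""
    return k * math.log(p) + (n - k) * math.log(1.0 - p)


def joint_negloglik(p, its2, chloro):
    """Negative log-likelihood of one shared frequency p given both loci.

    its2 and chloro are (strain_C_reads, total_reads) tuples.
    ASSUMPTION: independent binomial read sampling at each locus.
    """
    return -(binom_loglik(its2[0], its2[1], p)
             + binom_loglik(chloro[0], chloro[1], p))


def estimate_p(its2, chloro, grid_size=9999):
    """Grid search for the frequency that minimizes the joint negative log-likelihood."""
    grid = [(i + 1) / (grid_size + 1) for i in range(grid_size)]
    return min(grid, key=lambda p: joint_negloglik(p, its2, chloro))


# Hypothetical read counts for one population at one time point.
its2_counts = (620, 1000)
chloro_counts = (290, 500)
p_hat = estimate_p(its2_counts, chloro_counts)
print(round(p_hat, 3))
```

Under these assumptions the estimate is simply the pooled proportion of strain-C reads. A full state-space model additionally links the latent frequencies across time points, which is what the random terms in the TMB scripts handle.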

## Usage notes

Users can perform the analysis by running the R script (frequency_dynamics_analysis.R) after the installation of all the packages mentioned in "Packages and functions".

This folder contains:

- The sequence dataset, divided into the following directories: Reference (ITS2 and Chloro), Calibration (ITS2 and Chloro), ITS2 and Chloro. They must be extracted from the .zip archives before use.

  The "Reference" and "Calibration" folders contain the sequences from pure and mixed populations of the back-up A and C strains used in the experiment. "Chloro" and "ITS2" contain the sequences from the evolutionary experiments.

  Each file corresponds to one population measured at a specific time. It is obtained from a fasta file (after quality check) and gives the total number of copies of each individual sequence.

- One data set giving the salinities used in the stochastic treatments, and the associated growth rates estimated in a previous paper (Rescan et al. 2020): estimationR_gamma.csv

- One data set giving the salinities used in the constant treatments, and the associated growth rates estimated in a previous paper (Rescan et al. 2020): estimationR_DD.csv

- 3 C++ scripts that must be compiled and run with the R TMB package:

  * logistic_regression.cpp: computes the negative log-likelihood of the allele C frequency dynamics. The fixed slope gives the mean selection coefficient, as a function of environmental mean, variance, autocorrelation and predictability (autocorrelation²). The random slope gives the selection variance and is fitted as a function of the environmental variance treated as a factor: 0 or 1. The code returns estimates of all fixed parameters (fixed terms) and the estimated allele C frequency in each population and at each time point (random terms).

  * regression_all_line.cpp: computes the negative log-likelihood for the dynamics of each line in constant salinities. The mean selection coefficient is specific to each line, and its variance (random term) is generated by micro-environmental variation or demographic stochasticity.

  * regression_salinity.cpp: computes the negative log-likelihood for the regression between frequency changes and salinity (current and past), initial frequency and density during one transfer.

- 1 R script:

  * frequency_dynamics_analysis.R: builds the data table including all ITS2 and chloroplast sequences. Analyses the allele C dynamics. Estimates selection mean and variance in constant and stochastic autocorrelated environments. Analyses the selection response to previous and current salinity. Builds the paper figures.
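To make the "slope" interpretation of the selection coefficient concrete: under constant selection, the logit of the strain C frequency changes linearly with time, and the slope of that line is the selection coefficient. The Python sketch below illustrates this relationship with an ordinary least-squares fit on a synthetic, noise-free trajectory. It is only an illustration, not a port of the TMB scripts: it ignores observation error and random effects, the time unit is left abstract, and all names and numbers are hypothetical.

```python
import math


def logit(p):
    """Log-odds of a frequency p in (0, 1)."""
    return math.log(p / (1.0 - p))


def selection_coefficient(times, freqs):
    """Least-squares slope of logit(frequency) against time.

    ASSUMPTION: frequencies are known without error, so a simple
    regression suffices; the archived scripts fit a likelihood instead.
    """
    y = [logit(p) for p in freqs]
    n = len(times)
    t_mean = sum(times) / n
    y_mean = sum(y) / n
    sxy = sum((t - t_mean) * (v - y_mean) for t, v in zip(times, y))
    sxx = sum((t - t_mean) ** 2 for t in times)
    return sxy / sxx


# Synthetic trajectory: start at frequency 0.5 with a hypothetical
# selection coefficient of 0.1 per unit time.
s_true = 0.1
times = [0, 1, 2, 3, 4, 5]
freqs = [1.0 / (1.0 + math.exp(-(logit(0.5) + s_true * t))) for t in times]

s_hat = selection_coefficient(times, freqs)
print(round(s_hat, 6))
```

With real data, sampling noise at both loci and variation in selection among lines make the likelihood-based fit in the C++ files the appropriate tool; the regression here only shows what the fixed slope represents.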

## Funding

European Research Council, Award: STG-678140-FluctEvol