Dryad

Secondary evolve and re-sequencing: an experimental confirmation of putative selection targets without phenotyping

Cite this dataset

Burny, Claire et al. (2020). Secondary evolve and re-sequencing: an experimental confirmation of putative selection targets without phenotyping [Dataset]. Dryad. https://doi.org/10.5061/dryad.mkkwh70vs

Abstract

Evolve and re-sequencing (E&R) studies investigate the genomic responses of adaptation during experimental evolution. Because replicate populations evolve in the same controlled environment, consistent responses to selection across replicates are frequently used to identify reliable candidate regions that underlie adaptation to a new environment. However, recent work demonstrated that selection signatures can be restricted to one or a few replicate(s) only. These selection signatures frequently have a weak statistical support, and given the difficulties of functional validation, additional evidence is needed before considering them as candidates for functional analysis. Here, we introduce an experimental procedure to validate candidate loci with weak or replicate-specific selection signature(s). Crossing an evolved population from a primary E&R experiment to the ancestral founder population reduces the frequency of candidate alleles that have reached a high frequency. We hypothesize that genuine selection targets will experience a repeatable frequency increase after the mixing with the ancestral founders if they are exposed to the same environment (secondary E&R experiment). Using this approach, we successfully validate two overlapping selection targets, which showed a mutually exclusive selection signature in a primary E&R experiment of Drosophila simulans adapting to a novel temperature regime. We conclude that secondary E&R experiments provide a reliable confirmation of selection signatures that are either not replicated or show only a low statistical significance in a primary E&R experiment. Such experiments are particularly helpful to prioritize candidate loci for time-consuming functional follow-up investigations.

Usage notes

3R arm datasets from the Novoalign mapper. In the following datasets, the column headers are as follows: "F" denotes the generation and "R" denotes the replicate (x, y, z in the primary E&R; x.1, x.2 and z.1, z.2, z.3 in the secondary E&R, where replicates x and z have been diluted). The "ER" suffix indicates the primary E&R and the "dil" suffix refers to the secondary E&R. The suffix "simu" indicates neutrally simulated data. The prefix "s" indicates selection coefficient values. The polarization, indicated by a suffix such as "_polarized_F0_F70_xyz", is done on the rising allele in the primary E&R over the corresponding replicates (here x, y and z). Some examples:

"rising_polarized_F0_F70_z_ER": indicates the ID of the rising allele in the primary E&R over the corresponding replicate z.

"F70.Rz.cov_ER": coverage for replicate z at F70 in the primary E&R

"F0.Ry.freq_polarized_F0_F70_yz_ER": frequency for replicate y at F0 in the primary E&R, polarized on the rising allele in replicates y and z

"pval_CMH_IHW_polarized_F0_F70_xyz_ER": p-value of the CMH test over replicates x, y and z with IHW correction in the primary E&R, polarized on the rising allele in replicates x, y and z, from empirical data

"pval_CMH_IHW_polarized_F0_F70_xyz.simu_ER": p-value of the CMH test over replicates x, y and z with IHW correction in the primary E&R, polarized on the rising allele in replicates x, y and z, from neutrally simulated data

"FDR_CMH_IHW_polarized_F0_F70_xyz_ER": corresponding FDR of the CMH test over replicates x, y and z with IHW correction in the primary E&R, polarized on the rising allele in replicates x, y and z, from neutrally simulated data

"s_Rz_polarized_F0_F70_z_ER": selection coefficient s in replicate z obtained after polarization on the rising allele in the primary E&R in replicate z.

"F70.Rz.freq_polarized_F0_F70_z.simu": frequency for replicate z at F70 in the primary E&R, polarized on the rising allele in replicate z, from neutrally simulated data.

Note that F0 for the simulated data is equal to the observed F0 (see SI). Each row corresponds to one marker SNP.
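The naming convention above can be decoded programmatically. The following is a minimal Python sketch, not part of the dataset: the function name describe_column and the keys of the returned dictionary are hypothetical, and the parsing rules are inferred only from the example headers listed above, so headers that follow a different pattern may not be decoded correctly.

```python
import re


def describe_column(name):
    """Decode a column header into its parts (rules inferred from the examples)."""
    info = {"experiment": None, "simulated": False, "polarized_on": None,
            "generation": None, "replicate": None, "measure": None}
    rest = name

    # Trailing suffix: "_ER" = primary E&R, "_dil" = secondary E&R.
    if rest.endswith("_ER"):
        info["experiment"], rest = "primary", rest[:-len("_ER")]
    elif rest.endswith("_dil"):
        info["experiment"], rest = "secondary", rest[:-len("_dil")]

    # ".simu" marks neutrally simulated data.
    if rest.endswith(".simu"):
        info["simulated"], rest = True, rest[:-len(".simu")]

    # "_polarized_F0_F70_xyz": replicates used to define the rising allele.
    pol = re.search(r"_polarized_F\d+_F\d+_([a-z]+)$", rest)
    if pol:
        info["polarized_on"] = list(pol.group(1))
        rest = rest[:pol.start()]

    # "F70.Rz.cov" style: generation, replicate, measure.
    per_replicate = re.fullmatch(r"F(\d+)\.R(.+)\.([A-Za-z]+)", rest)
    # "s_Rz" style: selection coefficient for one replicate.
    selection = re.fullmatch(r"s_R(.+)", rest)
    if per_replicate:
        info["generation"] = int(per_replicate.group(1))
        info["replicate"] = per_replicate.group(2)
        info["measure"] = per_replicate.group(3)
    elif selection:
        info["replicate"] = selection.group(1)
        info["measure"] = "s"
    else:
        # e.g. "rising", "pval_CMH_IHW", "FDR_CMH_IHW"
        info["measure"] = rest
    return info


if __name__ == "__main__":
    for header in ("F70.Rz.cov_ER", "pval_CMH_IHW_polarized_F0_F70_xyz.simu_ER"):
        print(header, describe_column(header))
```

Stripping the suffixes from right to left keeps each rule independent, which matters because the last example header carries ".simu" without a trailing "_ER".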

sync file for the primary E&R
#column labels: chr - pos - ref - F0 ER Ry - F0 ER Rx - F0 ER Rz - F70 ER Ry - F70 ER Rx - F70 ER Rz
primary_ER.sync.zip

sync file for the secondary E&R
#column labels: chr - pos - ref - F0 dil Rx1 - F0 dil Rx2 - F0 dil Rz1 - F0 dil Rz2 - F0 dil Rz3 - F30 dil Rx1 - F30 dil Rx2 - F30 dil Rz1 - F30 dil Rz2 - F30 dil Rz3
secondary_ER.sync.zip
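
To make the column labels of the two sync files concrete, here is a minimal Python sketch of reading one sync line. It assumes the files follow the usual PoPoolation2 sync layout, which this page does not state explicitly: whitespace-separated fields, with each population given as six colon-separated counts in the order A:T:C:G:N:deletion. The label lists copy the order given in the column labels above (note that the primary file lists replicate y before x); the function names and the example line in the usage note are hypothetical.

```python
# Population order taken from the "#column labels" lines above.
PRIMARY_LABELS = ["F0.Ry", "F0.Rx", "F0.Rz", "F70.Ry", "F70.Rx", "F70.Rz"]
SECONDARY_LABELS = [
    "F0.Rx1", "F0.Rx2", "F0.Rz1", "F0.Rz2", "F0.Rz3",
    "F30.Rx1", "F30.Rx2", "F30.Rz1", "F30.Rz2", "F30.Rz3",
]
COUNT_KEYS = ("A", "T", "C", "G", "N", "del")  # assumed sync count order


def parse_sync_line(line, labels):
    """Return (chr, pos, ref, {label: {allele: count}}) for one sync line."""
    fields = line.split()
    if len(fields) - 3 != len(labels):
        raise ValueError("expected %d populations, found %d"
                         % (len(labels), len(fields) - 3))
    chrom, pos, ref = fields[0], int(fields[1]), fields[2]
    counts = {}
    for label, field in zip(labels, fields[3:]):
        values = [int(v) for v in field.split(":")]
        counts[label] = dict(zip(COUNT_KEYS, values))
    return chrom, pos, ref, counts


def allele_frequency(count, allele):
    """Frequency of one allele among A/T/C/G reads; None if no coverage."""
    depth = sum(count[base] for base in "ATCG")
    return count[allele] / depth if depth else None
```

For instance, with a made-up line in which every population has 10 A reads and 30 G reads, `allele_frequency(counts["F70.Rz"], "A")` evaluates to 0.25. Passing `SECONDARY_LABELS` instead handles the ten populations of the secondary E&R file.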

txt file for the primary E&R: allele frequency, coverage, raw / adjusted p-values, selection coefficients, ref / rising allele in the corresponding replicate

primary_ER.txt.zip

txt file for the secondary E&R: allele frequency, coverage, raw / adjusted p-values, selection coefficients, ref / rising allele in the corresponding replicate

secondary_ER.txt.zip