Dryad

Genetic population structure constrains local adaptation in sticklebacks

Cite this dataset

Kemppainen, Petri (2020). Genetic population structure constrains local adaptation in sticklebacks [Dataset]. Dryad. https://doi.org/10.5061/dryad.76hdr7str

Abstract

Repeated and independent adaptation to specific environmental conditions from standing genetic variation is common. However, if genetic variation is limited, the evolution of similar locally adapted traits may be restricted to genetically different and potentially less optimal solutions or prevented from happening altogether. Using a quantitative trait locus (QTL) mapping approach, we identified the genomic regions responsible for the repeated pelvic reduction (PR) in three crosses between nine-spined stickleback populations expressing full and reduced pelvic structures. In one cross, PR mapped to linkage group 7 (LG7) containing the gene Pitx1, known to control pelvic reduction also in the three-spined stickleback. In the two other crosses, PR was polygenic and attributed to ten novel QTL, of which 90% were unique to specific crosses. When screening the genomes from 27 different populations for deletions in the Pitx1 regulatory element, these were only found in the population in which PR mapped to LG7, even though the morphological data indicated large effect QTL for PR in several other populations as well. Consistent with the available theory and simulations parameterised on empirical data, we hypothesise that the observed variability in genetic architecture of PR is due to heterogeneity in the spatial distribution of standing genetic variation caused by >2x stronger population structuring among freshwater populations and >10x stronger genetic isolation by distance in the sea in nine-spined sticklebacks as compared to three-spined sticklebacks.

Methods

Contains datasets from three F2 intercrosses of marine females crossed with freshwater males from three different populations. These data were processed using Lep-MAP3, followed by Linkage Disequilibrium network analysis (LDna) and PCA-based complexity reduction, and analysed with a four-way single-mapping approach. R code for the LDna four-way single-mapping is available as custom R code (based on LDna v.0.64). Also contains phenotypic data on spine and girdle lengths, as well as standard body lengths. Raw reads will be available from "Genetic population structure constrains local adaptation in sticklebacks": PRJNA673430 (release date: 2020-11-30).

Contains BAM files of reads from 27 nine-spined stickleback populations spanning the Pitx1/pel deletion (LG7:17015109-17018744), processed with minimap2, samtools and custom R code.
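As a quick sanity check on the interval quoted above, the sketch below parses the region string and computes its length. The helper names `parse_region` and `region_length` are hypothetical, and the code assumes the coordinates are 1-based and inclusive (the convention samtools uses for region strings); the dataset description does not state this explicitly.

```python
import re


def parse_region(region):
    """Split 'CHROM:START-END' into (chrom, start, end) with integer coordinates."""
    match = re.fullmatch(r"([^:]+):(\d+)-(\d+)", region)
    if match is None:
        raise ValueError(f"not a CHROM:START-END region: {region!r}")
    chrom, start, end = match.group(1), int(match.group(2)), int(match.group(3))
    if start > end:
        raise ValueError("start must not exceed end")
    return chrom, start, end


def region_length(region):
    """Length in bp, ASSUMING 1-based inclusive coordinates."""
    _, start, end = parse_region(region)
    return end - start + 1


# Region string copied from the dataset description.
print(region_length("LG7:17015109-17018744"))  # 3636
```

Under that assumption the interval spans 3,636 bp on LG7.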

Contains processed neutral population genomic data (4,326 SNPs; BWA mem, Samtools, GATK and VCFtools) from nine- and three-spined sticklebacks from 25 and 31 populations, respectively. Raw reads for these data (SRA accession number: PRJNA672863) comprise a subset of a larger unpublished comparative stickleback study and are thus placed under embargo until that study is published.

Usage notes

QTL mapping, PVE and trait correlations
Data and analyses for QTL mapping (and code for figures and phenotypic correlation analyses) can be found in folder "QTL".

Pel-mapping
Pipelines, code and data for pel-mapping can be found in folder "Pel".

Simulations
The scripts Run_sim_0_9sp.R and Run_sim_0_3sp.R (in folder "Simulations/sim_res") can be run in R (preferably from the command line); all output is written to the folders "3sp" and "9sp". "copy_files.R" is used to parse the output from these scripts. Please note that the full simulation output (produced by quantinemo in folder "sim_res") is >50 GB, so it is not provided. However, all the files needed by the analyses can be found in the folders "sim_extract_freq" and "sim_extracted_data".

Note that, for copyright reasons, the program "quantinemo" is not included; it must be placed manually in the "Simulations/sim_res" folder for the code to work.
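Because the simulation scripts depend on a manually placed quantinemo program, a small pre-flight check can save a failed run. The following is only a sketch, not part of the dataset: the function names are hypothetical, it assumes the program file is named exactly "quantinemo" (it may differ on some platforms, e.g. with an .exe suffix), and it assumes the scripts are launched with Rscript from inside "Simulations/sim_res".

```python
import shutil
import subprocess
from pathlib import Path

# Script names taken from the usage notes.
SCRIPTS = ("Run_sim_0_9sp.R", "Run_sim_0_3sp.R")


def missing_prerequisites(sim_dir):
    """Return the names of required files that are absent from sim_dir."""
    sim_dir = Path(sim_dir)
    # ASSUMPTION: the quantinemo program file is named exactly "quantinemo".
    required = SCRIPTS + ("quantinemo",)
    return [name for name in required if not (sim_dir / name).exists()]


def run_simulations(sim_dir="Simulations/sim_res"):
    """Check prerequisites, then run each simulation script with Rscript."""
    missing = missing_prerequisites(sim_dir)
    if missing:
        raise FileNotFoundError(f"missing in {sim_dir}: {', '.join(missing)}")
    if shutil.which("Rscript") is None:
        raise RuntimeError("Rscript not found on PATH")
    for script in SCRIPTS:
        # ASSUMPTION: the scripts expect sim_dir as their working directory.
        subprocess.run(["Rscript", script], cwd=sim_dir, check=True)
```

Calling `run_simulations()` from the top-level folder of the unpacked dataset would then either report what is missing or run both scripts in turn; keep in mind that the full output exceeds 50 GB.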

Individual accession numbers for the population genomic data can be found in file "Objects.txt" in the "Simulations" folder.

All other code and temporary data files are in folder "Simulations".