Data from: Inferring the strength of directional selection on armor plates in Lake Washington stickleback while accounting for migration and drift
Data files
Jan 16, 2026 version files 131.25 KB
- 1_LakeWA_morphology_genotypes.zip (30.92 KB)
- 2_Fastsimcoal2.zip (76.66 KB)
- 3_FowardSimulation.zip (10.97 KB)
- 4_ALAN.zip (6.42 KB)
- README.md (6.28 KB)
Abstract
Contemporary evolution allows us to investigate how natural selection drives phenotypic and genotypic evolution in nature. Recent advances in molecular genetics have identified causative genes underlying adaptive traits, enabling estimation of selection coefficients at these loci. However, estimating selection is challenging when populations receive migrants from genetically and phenotypically distinct populations. With genome-wide data now allowing estimation of migration rates and effective population sizes, these demographic parameters can be integrated into models for measuring selection. In Lake Washington, USA, the frequency of the completely plated morph of the threespine stickleback (Gasterosteus aculeatus) increased from 1957 to 2005, plausibly due to increased trout predation pressure caused by enhanced water clarity. Here, we estimated the selection coefficient at a major locus responsible for the plate morph using historical data, taking migration and genetic drift into consideration. Model-based predictions of present allele frequencies were tested with samples collected in 2022. Consistent with directional selection, the completely plated morphs and the underlying allele have increased since 2005, but to higher frequencies than predicted, suggesting a recent increase in selection. Thus, integrating molecular genetics, population genomics, and simulations enables the estimation of selection strength while considering migration and drift, to reveal directional selection in nature.
In Lake Washington, USA, the frequency of the completely plated morph of the threespine stickleback (Gasterosteus aculeatus) increased from 1957 to 2005. Here, we sampled the stickleback in 2022 and investigated the temporal change in plate evolution together with previously published 2016 data.
General information
- Title of Dataset: Inferring the strength of directional selection on armor plates in Lake Washington stickleback while accounting for migration and drift
- Author Information
- Principal Investigator
Name: Jun Kitano
Institution: National Institute of Genetics
Address: Mishima, Shizuoka, Japan
Email: jkitano@nig.ac.jp
- Data curation
Name: Yo Yamasaki
Institution: National Institute of Genetics
Address: Mishima, Shizuoka, Japan
Email: yamasaki@nig.ac.jp
- Date of data collection: 1957-2022
- Geographic location of data collection: Lake Washington, WA, USA.
- Information about funding sources that supported the collection of the data
- NIG Summer Internship Program
- JSPS Kakenhi (22H04983, 20J01503, 21H02542, and 22KK0105)
- JST CREST (JPMJCR20S2)
Description of the data and file structure
There are four categories of datasets. Details of data collection methods are described in the manuscripts.
1. Morphological and genotype data collected from Lake Washington sticklebacks.
Files in 1_LakeWA_morphology_genotypes.zip
- 1_1_1_LakeWA_platenumber_distribution_1957_2022.tsv
Number of individuals with each plate number in each year. Data from 1957 to 1976 were obtained from the published literature. The detailed number of individuals with more than 12 plates in 1968/1969 is unknown.
- 1_1_2_LakeWA_platemorph_distribution_1957_2022.tsv
Number of individuals of each plate morph in each year. Data from 1957 to 1976 were obtained from the published literature.
- 1_2_lakeWA_Callele_freq.tsv
Frequency of the complete Eda allele from 2005 to 2022. The Stn382 marker was used for genotyping. Data for 2005 and 2016 are from Kitano et al. 2008 and Archambeault et al. 2020, respectively.
- 1_3_1_LakeWA_platenumber_genotypes_SL_raw_2005_2016_2022.tsv
Plate number, plate morph, standard length (SL), and genotypes at the Eda locus for each individual collected in 2005, 2016, and 2022. Loci other than Stn382 were not genotyped in 2005.
- 1_3_2_LakeWA_genotype_2005_2016_2022_trimmed.tsv
Plate number and genotypes for each individual. This file was generated from the 1_3_1 file by removing individuals with NA for plate number or genotype in 2016 and 2022.
- 1_3_3_LakeWA_SL_2005_2016_2022_trimmed.tsv
Plate number and SL for each individual. This file was generated from the 1_3_1 file by removing individuals with NA for plate number or SL.
Note
- Missing data is coded as NA.
- The columns are as follows:
- Year: Collected year
- No_samples: Observed sample numbers
- LPN: Left Plate Number
- C_allele_freq: Frequency of the complete allele at the Stn382 locus
- ID: Sample identifier
- Sex: Code of sex. F = Female, M = Male.
- Plate_Morph: Code of classified plate morph. C = Complete, P = Partial, L = Low.
- SL: Standard Length
- Stn382: Genotypes of genetic marker Stn382. C = Complete, L = Low.
- LP3621: Genotypes of genetic marker LP3621. C = Complete, L = Low.
- Cnv770: Genotypes of genetic marker Cnv770. C = Complete, L = Low.
- Stn381: Genotypes of genetic marker Stn381. C = Complete, L = Low.
- LP13173: Genotypes of genetic marker LP13173. C = Complete, L = Low.
- ln_SL: Natural logarithm transformed standard length
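As an illustration of how the per-individual genotype columns relate to C_allele_freq, the following is a minimal Python sketch that counts complete alleles per year. It assumes, without confirmation from the files themselves, that each genotype is written as a two-letter string such as CC, CL, or LL and that the table carries the Year and Stn382 columns listed above. The helper name `c_allele_freq_by_year` and the example rows are hypothetical and are not taken from the dataset.

```python
import csv
import io
from collections import defaultdict


def c_allele_freq_by_year(tsv_text, marker="Stn382"):
    """Return {year: frequency of the C allele} from per-individual genotypes.

    Assumes genotypes are two-letter strings (e.g. "CL"); this encoding is an
    assumption, so check it against the actual file before use.
    """
    counts = defaultdict(lambda: [0, 0])  # year -> [C alleles, total alleles]
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    for row in reader:
        genotype = row[marker]
        if genotype == "NA":  # missing data is coded as NA
            continue
        counts[row["Year"]][0] += genotype.count("C")
        counts[row["Year"]][1] += len(genotype)
    return {year: c / n for year, (c, n) in counts.items()}


# Hypothetical rows for illustration only.
example = (
    "ID\tYear\tStn382\n"
    "a1\t2005\tCL\n"
    "a2\t2005\tLL\n"
    "a3\t2005\tNA\n"
    "b1\t2022\tCC\n"
    "b2\t2022\tCL\n"
)
freqs = c_allele_freq_by_year(example)
print(freqs)
```

To apply the same function to one of the .tsv files, read the file contents into a string first and pass the marker column of interest.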
2. Files used for Fastsimcoal2.
Files in 2_Fastsimcoal2.zip
The three tested models are included. Each model has three files, with the following contents.
- .obs: Observed allele frequency spectrum.
- .tpl: Structure of demographic model.
- .est: Explored parameter range.
3. Scripts used for SLiM forward simulations and ABC estimation
Files in 3_FowardSimulation.zip
- 3_1_LW_monomorphic.slim
In this script, we assumed that the complete Eda allele is fixed in the Puget Sound marine population and the low Eda allele is fixed in the Kelsey Creek freshwater population.
- 3_2_LW_polymorphic.slim
In this script, we used the observed frequencies of the complete Eda allele in both the Puget Sound marine population and the Kelsey Creek freshwater population.
- 3_3_LW_SLiM_iteration.ipynb
This notebook estimates the selection coefficient and dominance coefficient by ABC and plots the results.
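For readers unfamiliar with ABC, the general idea of rejection-based estimation can be sketched in a few lines of Python. This is not the authors' SLiM pipeline: it swaps in a simple Wright-Fisher recursion with migration, selection, and binomial drift at one biallelic locus, uses a single migrant source, and estimates only the selection coefficient. Every function name, the uniform prior, the order of events within a generation, and all parameter values below are assumptions chosen for illustration.

```python
import random


def expected_freq(p, s, h, m, p_mig):
    """Deterministic C-allele frequency after migration, then selection."""
    p = (1 - m) * p + m * p_mig                   # migration from one source
    w_cc, w_cl, w_ll = 1 + s, 1 + h * s, 1.0      # genotype fitnesses
    q = 1 - p
    w_bar = p * p * w_cc + 2 * p * q * w_cl + q * q * w_ll
    return (p * p * w_cc + p * q * w_cl) / w_bar  # selection


def simulate(p0, s, h, m, p_mig, n_diploid, generations, rng):
    """Simulate the C-allele frequency forward in time with drift."""
    p = p0
    for _ in range(generations):
        p_exp = expected_freq(p, s, h, m, p_mig)
        # Drift: binomial sampling of 2N gene copies.
        copies = sum(rng.random() < p_exp for _ in range(2 * n_diploid))
        p = copies / (2 * n_diploid)
    return p


def abc_rejection(p_obs, n_sims, eps, rng, **model):
    """Keep prior draws of s whose simulated frequency is within eps of p_obs."""
    accepted = []
    for _ in range(n_sims):
        s = rng.uniform(0.0, 1.0)  # hypothetical uniform prior on s
        if abs(simulate(s=s, rng=rng, **model) - p_obs) <= eps:
            accepted.append(s)
    return accepted


# All values below are hypothetical, not estimates from the study.
rng = random.Random(1)
posterior = abc_rejection(
    p_obs=0.8, n_sims=300, eps=0.05, rng=rng,
    p0=0.5, h=0.5, m=0.01, p_mig=0.5, n_diploid=100, generations=15,
)
print(len(posterior), "accepted draws")
```

The accepted draws approximate the posterior distribution of s under this toy model. Estimating the dominance coefficient as well would mean drawing h from its own prior inside the loop and storing (s, h) pairs.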
4. Data of artificial light at night (ALAN)
Files in 4_ALAN.zip
- 4_1_VIIRS_annual_reflectance_apr_sep.csv
Nighttime light reflectance at each location during the summer season (April to September) from 2016 to 2022.
- 4_2_VIIRS_annual_reflectance_oct_mar.csv
Nighttime light reflectance at each location during the winter season (October to March) from 2016 to 2022.
Note
- Unit: nW cm⁻² sr⁻¹
Sharing/Access information
- Licenses/restrictions placed on the data: CC0 1.0 Universal (CC0 1.0) Public Domain
- Links to publications that cite or use the data:
Yamasaki, Y.Y., Yamaguchi, R., Nagano, A.J., Chen, B.-J., Musto, N., Archambeault, S., Peichel, C.L., Schulien, J.A., Code, T.J., Beauchamp, D.A., Kitano, J. Evolving Armor Plates in the Lake Washington Threespine Stickleback
- Links to other publicly accessible locations of the data:
- Data from: Adaptation via pleiotropy and linkage: association mapping reveals a complex genetic architecture within the stickleback Eda locus
- Original source of 2016 data.
Data was derived from the following sources:
- Adaptation via pleiotropy and linkage: Association mapping reveals a complex genetic architecture within the stickleback Eda locus
- Reverse Evolution of Armor Plates in the Threespine Stickleback
Code/Software
Software and versions
- Fastsimcoal2: 2.6
- SLiM: 3.3
