Variable parallelism in the genomic basis of age at maturity across spatial scales in Atlantic Salmon
Data files
Jun 18, 2025 version (19.07 MB total)
- README.md (3.24 KB)
- Salmo_220K_merge2022_grilse.bed (16.10 MB)
- Salmo_220K_merge2022_grilse.bim (2.89 MB)
- Salmo_220K_merge2022_grilse.fam (20.17 KB)
- salmo_220K_site_group_meta_2022_NA_grilse_DFOregions.csv (1.55 KB)
- Salmon_Metadata_Bam_Unique_HD.txt (65.10 KB)
Abstract
Complex traits often exhibit complex underlying architectures shaped by evolution from standing variation, hard and soft sweeps, and alleles of varying effect size. Increasingly, studies implicate large-effect loci and polygenic patterns underpinning adaptation, but the extent to which common genetic architectures are utilized during adaptation is not understood. Sea age, or age at maturation, is a significant life history trait in Atlantic Salmon (Salmo salar) that has been studied extensively in European Atlantic populations, with repeated identification of large-effect loci. However, the genetic basis of sea age within North American populations remains unclear, as does the potential for a parallel trans-Atlantic genomic basis to sea age. Here, we used a large SNP array and whole genome re-sequencing to explore the genomic basis of sea age in North American Atlantic Salmon. We found significant associations at the gene and SNP level with a known large-effect locus (vgll3), indicating genetic parallelism, but found that this pattern varied by sex and location. We identified non-repeated, highly predictive loci associated with sea age among populations and sexes within North America, indicating polygenicity and low parallelism. Despite low parallelism, we uncovered conserved molecular pathways associated with sea age that were consistently enriched among comparisons, including calcium signalling, MAPK signalling, focal adhesion, and phosphatidylinositol signalling. Together, our results indicate parallelism of the molecular basis of sea age in North American Atlantic Salmon across large-effect genes and molecular pathways, despite polygenicity. These findings reveal roles for both contingency and repeated adaptation at the molecular level in the evolution of life history variation.
https://doi.org/10.5061/dryad.g1jwstqz3
This repo contains metadata used in a genomic study of Atlantic Salmon at-sea maturation strategies across North America. The metadata here is as follows:
Salmon_Metadata_Bam_Unique_HD.txt - Individual sex, sea age, and river of origin information for 582 individuals sequenced with whole genome resequencing. Paired reads are available at NCBI with accession PRJNA1083490. Variables are:
- BAM - Identifier of the individual's BAM (Binary Alignment Map) file, the format used to store aligned sequence reads from high-throughput sequencing.
- FGL_ID - Fisheries Genetics Lab identifier, a unique code assigned by the fisheries genetics laboratory for specimen tracking and database management.
- RiverCode - A standardized code representing the river or waterway where the salmon sample was collected, used for geographic classification and location tracking.
- SiteName - The specific name or description of the sampling location within the river system, providing detailed geographic context for where the salmon was captured.
- AgeClass - The sea age classification of the salmon, indicating how many years the fish spent in marine waters before returning to freshwater (e.g., 1SW for one sea-winter, MSW for multi sea-winters).
- PCR_Sex - The sex of the salmon specimen determined through PCR amplification of the sdY (sexually dimorphic on the Y chromosome) marker, a genetic method for accurate sex determination in salmonids.
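As a quick orientation to this table, the following is a minimal sketch that tallies individuals by sea age class and sex. It assumes the file is tab-delimited with a header row using the variable names listed above; the delimiter, the sex codes, and the demo rows are illustrative assumptions, not values taken from the real file.

```python
import csv
import io
from collections import Counter


def tally_age_by_sex(handle, delimiter="\t"):
    """Count individuals per (AgeClass, PCR_Sex) pair.

    `handle` is any open text file or file-like object. The tab delimiter
    is an assumption; pass a different one if the file differs.
    """
    reader = csv.DictReader(handle, delimiter=delimiter)
    counts = Counter()
    for row in reader:
        counts[(row["AgeClass"], row["PCR_Sex"])] += 1
    return counts


# Made-up rows for illustration only; IDs, codes, and sex labels are hypothetical.
demo = io.StringIO(
    "BAM\tFGL_ID\tRiverCode\tSiteName\tAgeClass\tPCR_Sex\n"
    "a.bam\tID1\tRV1\tSite one\t1SW\tF\n"
    "b.bam\tID2\tRV1\tSite one\tMSW\tF\n"
    "c.bam\tID3\tRV2\tSite two\t1SW\tM\n"
    "d.bam\tID4\tRV2\tSite two\t1SW\tM\n"
)
demo_counts = tally_age_by_sex(demo)
for (age_class, sex), n in sorted(demo_counts.items()):
    print(age_class, sex, n)
```

For the real file, the same function could be called on `open("Salmon_Metadata_Bam_Unique_HD.txt", newline="")` once the delimiter has been confirmed.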
Salmo_220K_merge2022_grilse.fam, .bed, .bim - PLINK files holding individual IDs (.fam), binary genotypes (.bed), and marker information (.bim).
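To illustrate how the three PLINK files fit together, here is a minimal pure-Python sketch of decoding a PLINK 1 binary genotype (.bed) payload. It assumes the standard variant-major layout: a three-byte header followed by one block per marker, with each individual stored in two bits. The number of individuals and markers would come from the line counts of the .fam and .bim files. The function name and the demo bytes are illustrative and not part of the dataset.

```python
import math

BED_MAGIC = b"\x6c\x1b\x01"  # PLINK 1 .bed header, variant-major mode

# Two-bit genotype code -> number of copies of the first allele listed
# in the .bim file; None marks a missing genotype.
CODE_TO_A1_COUNT = {0b00: 2, 0b01: None, 0b10: 1, 0b11: 0}


def decode_bed(data, n_samples, n_variants):
    """Decode raw .bed bytes into a list of per-variant genotype lists."""
    if data[:3] != BED_MAGIC:
        raise ValueError("not a variant-major PLINK 1 .bed file")
    bytes_per_variant = math.ceil(n_samples / 4)  # four samples per byte
    expected = 3 + bytes_per_variant * n_variants
    if len(data) != expected:
        raise ValueError(f"expected {expected} bytes, got {len(data)}")
    genotypes = []
    for v in range(n_variants):
        start = 3 + v * bytes_per_variant
        block = data[start:start + bytes_per_variant]
        row = []
        for s in range(n_samples):
            # Samples are packed from the low-order bits upward.
            code = (block[s // 4] >> (2 * (s % 4))) & 0b11
            row.append(CODE_TO_A1_COUNT[code])
        genotypes.append(row)
    return genotypes


# Hypothetical payload: 5 samples, 2 variants (not taken from the real file).
demo_bed = BED_MAGIC + bytes([0x78, 0x00, 0xFF, 0x03])
demo_genotypes = decode_bed(demo_bed, n_samples=5, n_variants=2)
print(demo_genotypes)  # [[2, 1, 0, None, 2], [0, 0, 0, 0, 0]]
```

The length check doubles as a consistency test between the three files. A pure-Python loop like this is slow for an array of this size, so PLINK itself or a dedicated reader is likely more practical for real analyses.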
salmo_220K_site_group_meta_2022_NA_grilse_DFOregions.csv - River name, code, location with latitude and longitude, and proportion one sea winter (grilse) fish returning per river, used as phenotype in population-level GEA. Variables are:
- SiteCode - A standardized alphanumeric code assigned to each sampling location, used as a unique identifier for database management and cross-referencing.
- Site - The descriptive name or full designation of the sampling location, providing human-readable identification of where specimens were collected.
- Group - Continent of origin identifier used for salmon samples analyzed with the 220K SNP (Single Nucleotide Polymorphism) array, categorizing specimens by their continental source population.
- Lat - Latitude coordinate of the sampling site in decimal degrees, providing precise geographic positioning for mapping and spatial analysis.
- Lon - Longitude coordinate of the sampling site in decimal degrees, providing precise geographic positioning for mapping and spatial analysis.
- p1SW - The proportion of returning salmon at this site that are one sea-winter (1SW, grilse) fish, i.e. fish that spent one winter in marine waters before returning.
- Region - Regional classification as specified by DFO (Department of Fisheries and Oceans Canada), providing standardized geographic or management unit designations for Canadian fisheries administration.
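As an example of working with the site-level table, the sketch below averages p1SW within each DFO region. It assumes a comma-delimited file with a header row matching the variable names above, and it treats empty or `NA` entries as missing; the missing-value coding, the region labels, and the demo rows are assumptions for illustration only.

```python
import csv
import io
from collections import defaultdict


def mean_p1sw_by_region(handle):
    """Return {Region: mean p1SW}, skipping rows with a missing p1SW."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for row in csv.DictReader(handle):
        value = row["p1SW"].strip()
        if value in ("", "NA"):  # assumed missing-value coding
            continue
        sums[row["Region"]] += float(value)
        counts[row["Region"]] += 1
    return {region: sums[region] / counts[region] for region in sums}


# Made-up rows for illustration only; site codes and regions are hypothetical.
demo_sites = io.StringIO(
    "SiteCode,Site,Group,Lat,Lon,p1SW,Region\n"
    "AAA,Site one,NA,47.0,-53.0,0.25,RegionA\n"
    "BBB,Site two,NA,48.0,-54.0,0.75,RegionA\n"
    "CCC,Site three,NA,45.0,-64.0,0.5,RegionB\n"
    "DDD,Site four,NA,46.0,-65.0,NA,RegionB\n"
)
demo_means = mean_p1sw_by_region(demo_sites)
print(demo_means)  # {'RegionA': 0.5, 'RegionB': 0.5}
```

The standard-library `csv` module is used here because it returns every field as a plain string, which avoids a data-frame reader silently interpreting the literal `NA` in a column such as Group as a missing value.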
Atlantic Salmon Axiom SNP array data in PLINK format (.bed, .bim, .fam), with matched sea age phenotypes for population-level and individual-level trait association analyses.
- Kess, Tony; Lehnert, Sarah J.; Bentzen, Paul et al. (2024). Variable parallelism in the genomic basis of age at maturity across spatial scales in Atlantic Salmon. Ecology and Evolution. https://doi.org/10.1002/ece3.11068
