Data from: Genetic basis of adult migration timing in anadromous steelhead discovered through multivariate association testing
Hess, Jon E., Columbia River Inter-Tribal Fish Commission
Zendt, Joseph S., Columbia River Inter-Tribal Fish Commission
Matala, Amanda R., Columbia River Inter-Tribal Fish Commission
Narum, Shawn R., Columbia River Inter-Tribal Fish Commission
Published Apr 15, 2016 on Dryad.
Cite this dataset
Hess, Jon E.; Zendt, Joseph S.; Matala, Amanda R.; Narum, Shawn R. (2016). Data from: Genetic basis of adult migration timing in anadromous steelhead discovered through multivariate association testing [Dataset]. Dryad. https://doi.org/10.5061/dryad.62q6n
Migration traits are presumed to be complex and to involve interaction among multiple genes; thus, we employed both univariate analyses and a multivariate Random Forest (RF) machine learning algorithm to conduct association mapping of 15,239 single nucleotide polymorphisms (SNPs) for the adult migration-timing phenotype in steelhead (Oncorhynchus mykiss). Our study focused on a model natural population of steelhead that exhibits two distinct migration-timing life histories with high levels of admixture in nature. Neutral divergence was limited between fish exhibiting summer- and winter-run migration owing to high levels of interbreeding, but a univariate mixed linear model found three SNPs from a major-effect gene to be significantly associated with migration timing (p < 0.000005), and these SNPs explained 46% of trait variation. Alignment to the annotated Salmo salar genome provided evidence that all three SNPs localize within a 46 kb region overlapping GREB1-like (an estrogen target gene) on chromosome Ssa03. Additionally, multivariate analyses with RF identified that these three SNPs plus 15 additional SNPs explained up to 60% of trait variation. These candidate SNPs may provide the ability to predict adult migration timing of steelhead and so facilitate conservation management of this species. This study also demonstrates the benefit of multivariate analyses for association studies.
This file is the genotype input for the program TASSEL, which was used to perform the univariate GWAS for this study.
This file contains the individual population-structure Q values that were used as covariates in the program TASSEL to perform the univariate GWAS. It includes the individual proportions of ancestry for K=6 and for K=10. The analyses reported in the study used K=10; results with K=6 are discussed in the supplemental methods.
This file contains the additional traits and covariates that were used to perform the univariate GWAS in TASSEL. Covariates include collection "year" and gender (1=male, 2=female). The trait phenotype is listed as "Dayreorder", which represents migration timing in units of ordinal day. We "re-ordered" these days to reflect the biological sequence of annual steelhead migrations. Refer to the methods in the study for more detail.
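The idea behind a "re-ordered" ordinal day can be illustrated with a minimal Python sketch. It is not the authors' procedure: the exact rule used to build "Dayreorder" is given only in the study's methods, and the season start day below (ordinal day 152, i.e. 1 June in a non-leap year) is a hypothetical value chosen purely for illustration, as are the function and variable names. The sketch only shows how calendar days can be shifted so that a migration season spanning the turn of the year is numbered in one continuous sequence.

```python
def reorder_day(ordinal_day: int, season_start: int, days_in_year: int = 365) -> int:
    """Map a calendar ordinal day (1..days_in_year) to a 1-based day
    counted from `season_start`, wrapping around the end of the year."""
    if not 1 <= ordinal_day <= days_in_year:
        raise ValueError(f"ordinal_day out of range: {ordinal_day}")
    if not 1 <= season_start <= days_in_year:
        raise ValueError(f"season_start out of range: {season_start}")
    return (ordinal_day - season_start) % days_in_year + 1


# ASSUMPTION: hypothetical season start, NOT taken from the study.
HYPOTHETICAL_SEASON_START = 152

# Calendar days for fish arriving from summer through the following winter.
calendar_days = [152, 300, 365, 1, 60]
reordered = [reorder_day(d, HYPOTHETICAL_SEASON_START) for d in calendar_days]
print(reordered)
```

With this shift, a fish arriving on calendar day 1 (early January) receives a larger value than one arriving on day 365 of the preceding December, so the phenotype increases monotonically through the migration season instead of resetting at the new year.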