Multi-population genome-wide association studies involving four distinct barley populations
Data files
Feb 12, 2025 version files, 204.74 MB
HeadingDate_Phenotypes.txt (1.38 MB)
Lodging_Phenotypes.txt (1.29 MB)
RawGenotypes.csv (202.07 MB)
README.md (1.59 KB)
Abstract
The power of genome-wide association studies (GWAS) depends heavily on sample size. One strategy to increase sample size is to combine datasets from different populations. However, this approach introduces challenges due to heterogeneity between populations. With these data, we set up a statistically sound model to account for such heterogeneity. Using this model, we combined up to four distinct barley populations in GWAS to detect genomic regions associated with heading date and stem lodging. Each population represented an applied breeding program with a unique combination of growth habit (winter versus spring) and row type (2-rowed versus 6-rowed).
By comparing single-population GWAS with multi-population GWAS, we identified both quantitative trait loci (QTLs) that were shared across populations and population-specific QTLs. We found that multi-population GWAS provided greater statistical power than single-population analyses, revealed QTLs that were undetectable in small populations, and explained an overall larger proportion of the phenotypic variance.
Our findings offer a promising approach to accelerate genomics-based breeding in new breeding populations with limited data. This methodology is applicable to a wide range of datasets where sample sizes are limited for various reasons.
https://doi.org/10.5061/dryad.n2z34tn6h
Description of the data and file structure
The data consist of genotypes from four barley breeding programs at Nordic Seed and their phenotypic records for heading date and lodging in multi-environment field trials.
All populations were assessed for heading date and lodging. The 6RW population was scored in fewer environments (6–9) than the other populations, which were scored in 9–18 year-location combinations. In the multi-population GWAS, combining the 6RW population with others resulted in 15 candidate QTLs for heading date and 9 for lodging. Heritability for heading date was generally higher than for lodging. Broad-sense entry-mean heritability ranged from 0.93 (6RW, 6RS) to 0.95 (2RS) for heading date, and from 0.52 (6RW) to 0.80 (2RS) for lodging. Missing data are given as NA.
Files and variables
File: HeadingDate_Phenotypes.txt
Description: The file contains the observations for heading date used in the genome-wide association studies presented in the paper.
File: Lodging_Phenotypes.txt
Description: The file contains the observations for lodging used in the genome-wide association studies presented in the paper.
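Because missing phenotypes are coded as NA, any script that reads the phenotype files has to treat that string as missing rather than as a number. The sketch below shows one way to do this with the Python standard library. The tab delimiter and the column names `line`, `environment`, and `value` are assumptions made for illustration only; the actual layout of the files is not described here and should be checked against README.md before use.

```python
import csv
import io
from statistics import mean

# Hypothetical layout: tab-separated, one row per observation.
# The column names are placeholders, not taken from the dataset.
SAMPLE = (
    "line\tenvironment\tvalue\n"
    "L001\tE1\t152\n"
    "L001\tE2\tNA\n"
    "L002\tE1\t149\n"
    "L002\tE2\t151\n"
)

def parse_value(text):
    """Return a float, or None when the field is the missing-data code 'NA'."""
    text = text.strip()
    return None if text == "NA" else float(text)

def line_means(handle, delimiter="\t"):
    """Average the non-missing observations of each line."""
    values = {}
    for row in csv.DictReader(handle, delimiter=delimiter):
        value = parse_value(row["value"])
        if value is not None:
            values.setdefault(row["line"], []).append(value)
    return {line: mean(obs) for line, obs in values.items()}

means = line_means(io.StringIO(SAMPLE))
```

To apply the same function to a real file, pass an open file handle instead of the in-memory sample and adjust the delimiter and column names to match the file.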
File: RawGenotypes.csv
Description: The file contains the raw genotypes for all lines studied genetically in the paper. SNP markers are coded as -1/0/1. The marker order has been randomly shuffled to protect the data.
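Many tools expect allele counts (0/1/2) rather than the -1/0/1 coding, so a conversion step is often needed. The sketch below assumes that -1 and 1 denote the two homozygous classes and 0 the heterozygote, which is a common convention but is not stated in this description. It also assumes that missing calls have already been read as `None`. The function names are hypothetical.

```python
def to_dosage(code):
    """Map a -1/0/1 genotype code to a 0/1/2 allele count; None stays missing."""
    if code is None:
        return None
    if code not in (-1, 0, 1):
        raise ValueError(f"unexpected genotype code: {code!r}")
    return code + 1

def allele_frequency(codes):
    """Frequency of the allele counted by code 1, ignoring missing calls."""
    dosages = [to_dosage(c) for c in codes if c is not None]
    if not dosages:
        return None  # no usable calls for this marker
    # Each diploid line carries two alleles per marker.
    return sum(dosages) / (2 * len(dosages))

# One marker scored on five lines, with one missing call.
marker = [-1, -1, 0, 1, None]
freq = allele_frequency(marker)
maf = min(freq, 1 - freq)  # minor allele frequency
```

Because the marker order is shuffled, per-marker summaries such as allele frequency remain meaningful, whereas anything that depends on marker position does not.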