A method for identifying environmental stimuli and genes responsible for genotype-by-environment interactions from a large-scale multi-environment data set

Published Dec 22, 2021 on Dryad. https://doi.org/10.5061/dryad.rr4xgxd6r

Data files

Dec 22, 2021 version: files total 461.89 MB

Abstract

It is not yet fully understood which environmental stimuli cause genotype-by-environment (G × E) interactions in real fields, when these interactions occur, and which genes respond to the stimuli. Large-scale multi-environment data sets are attractive data sources for addressing these questions because they potentially span a wide range of environmental conditions. In this study, we developed a data-driven approach termed Environmental Covariate Search Affecting Genetic Correlations (ECGC) to identify the environmental stimuli and genes responsible for G × E interactions from large-scale multi-environment data sets. ECGC was applied to a soybean (Glycine max) data set consisting of 25,158 records collected in 52 environments. ECGC showed which meteorological factors shaped the G × E interactions in six traits, including yield, flowering time, and protein content, and when those factors were involved. For example, it indicated that precipitation around sowing dates and hours of sunshine just before maturity were relevant to the interactions observed for yield. Moreover, genome-wide association mapping of the sensitivities to the identified stimuli discovered both candidate and known genes responsible for the G × E interactions. Our results demonstrate the capability of data-driven approaches to provide novel insights into the G × E interactions observed in fields. This dataset provides the data used in this study and the supplementary tables cited in the manuscript.
