Dryad

Data from: An equation to predict the accuracy of genomic values by combining data from multiple traits, populations, or environments

Cite this dataset

Wientjes, Yvonne C. J.; Bijma, Piter; Veerkamp, Roel F.; Calus, Mario P. L. (2016). Data from: An equation to predict the accuracy of genomic values by combining data from multiple traits, populations, or environments [Dataset]. Dryad. https://doi.org/10.5061/dryad.1525t

Abstract

Predicting the accuracy of estimated genomic values using genome-wide marker information is an important step in designing training populations. Currently, different deterministic equations are available to predict accuracy within populations, but not for multipopulation scenarios where data from multiple breeds, lines or environments are combined. Therefore, our objective was to develop and validate a deterministic equation to predict the accuracy of genomic values when different populations are combined in one training population. The input parameters of the derived prediction equation are the number of individuals and the heritability from each of the populations in the training population; the genetic correlations between the populations, i.e., the correlation between allele substitution effects of quantitative trait loci; the effective number of chromosome segments across predicted and training populations; and the proportion of the genetic variance in the predicted population captured by the markers in each of the training populations. Validation was performed based on real genotype information of 1033 Holstein–Friesian cows that were divided into three different populations by combining half-sib families in the same population. Phenotypes were simulated for multiple scenarios, differing in heritability within populations and in genetic correlations between the populations. Results showed that the derived equation can accurately predict the accuracy of estimated genomic values for different scenarios of multipopulation genomic prediction. Therefore, the derived equation can be used to investigate the potential accuracy of different multipopulation genomic prediction scenarios and to decide on the optimal design of training populations.

Usage notes