Genome-wide association and genomic prediction for a reproductive index summarizing fertility outcomes in U.S. Holsteins

Cite this dataset

Seabury, Christopher (2023). Genome-wide association and genomic prediction for a reproductive index summarizing fertility outcomes in U.S. Holsteins [Dataset]. Dryad.


Subfertility represents one major challenge to enhancing dairy production and efficiency. Herein, we use a reproductive index (RI), which expresses the predicted probability of pregnancy following artificial insemination, together with Illumina 778K genotypes to perform single-locus and multi-locus genome-wide association analyses (GWAA) on 2,448 geographically diverse U.S. Holstein cows and to produce genomic heritability estimates. Moreover, we use genomic best linear unbiased prediction (GBLUP) to investigate the potential utility of the RI by performing genomic predictions with cross-validation. Notably, genomic heritability estimates for the U.S. Holstein RI were moderate (0.1654 ± 0.0317 to 0.2550 ± 0.0348), while single-locus and multi-locus GWAA revealed overlapping quantitative trait loci (QTL) on BTA6 and BTA29, including known QTL for daughter pregnancy rate (DPR) and cow conception rate (CCR). Multi-locus GWAA revealed seven additional QTL, including one on BTA7 (60 Mb) that is adjacent to a known heifer conception rate (HCR) QTL (59 Mb). Positional candidate genes for the detected QTL included male and female fertility loci (i.e., spermatogenesis, oogenesis), meiotic and mitotic regulators, and genes associated with immune response, milk yield, enhanced pregnancy rates, and the reproductive-longevity pathway. Based on the proportion of phenotypic variance explained (PVE), all detected QTL (n = 13; P ≤ 5e-05) were estimated to have moderate (1.0% < PVE ≤ 2.0%) or small (PVE ≤ 1.0%) effects on the predicted probability of pregnancy. Genomic prediction using GBLUP with cross-validation (k = 3) produced mean predictive abilities (0.1692–0.2301) and mean genomic prediction accuracies (0.4119–0.4557) similar to those previously reported for bovine health and production traits.


Results from the single-locus and multi-locus genome-wide association analyses are summarized in Additional File 1 and Additional File 2, respectively.

Results from the GBLUP-based genomic predictions with k-fold cross-validation are summarized in Additional File 3. Summary statistics for milk yield (up to 90 days of lactation, MAVG90kg) in high-fertility and low-fertility individuals are described in Additional File 4.
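
For readers working with Additional File 3, the two cross-validation metrics can be related with a short sketch. It assumes a common convention, not confirmed by this record: predictive ability is the Pearson correlation between genomic predictions and observed phenotypes in the validation fold, and genomic prediction accuracy is that correlation divided by the square root of the genomic heritability. The function names are hypothetical, and pairing the upper predictive ability with the upper heritability estimate is an assumption made only for illustration.

```python
import math
from statistics import mean


def pearson(x, y):
    """Pearson correlation between two equal-length numeric sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def prediction_accuracy(predictive_ability, h2):
    """Assumed convention: accuracy = predictive ability / sqrt(h2)."""
    return predictive_ability / math.sqrt(h2)


# Illustrative pairing of values quoted in the abstract (an assumption;
# the per-analysis values are in Additional File 3).
acc = prediction_accuracy(0.2301, 0.2550)
print(round(acc, 4))
```

Under this assumed convention the result is close to the upper accuracy reported in the abstract; because the reported figures are means over folds, other pairings need not match exactly.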

Usage notes

The files are annotated in detail and self-explanatory when evaluated within the context of the manuscript.  See the ReadMe.


National Institute of Food and Agriculture, Award: 2013-68004