
Sequence-based genome-wide association study of individual milk mid-infrared wavenumbers in mixed-breed dairy cattle

Cite this dataset

Tiplady, Kathryn et al. (2021). Sequence-based genome-wide association study of individual milk mid-infrared wavenumbers in mixed-breed dairy cattle [Dataset]. Dryad.


Fourier-transform mid-infrared (FT-MIR) spectroscopy provides a high-throughput and inexpensive method for predicting milk composition and other novel traits from milk samples. Whilst there have been many genome-wide association studies (GWAS) conducted on FT-MIR predicted traits, there have been few GWAS for individual FT-MIR wavenumbers. Here we examine associations between genomic regions and individual FT-MIR wavenumber phenotypes within a population of 38,085 mixed-breed New Zealand dairy cattle with imputed whole-genome sequence. GWAS were conducted for each of 895 individual FT-MIR wavenumber phenotypes and three FT-MIR predicted milk composition traits, and gene annotation and mammary tissue gene expression datasets were employed to identify candidate causative genes and variants. This resulted in the identification of 38 co-locating, co-segregating expression QTL (eQTL), and 31 protein-sequence mutations for FT-MIR wavenumber phenotypes, the latter including a null mutation in ABO that has a potential role in changing milk oligosaccharide profiles. For the candidate causative genes implicated in these analyses, the strength of association between relevant loci and each wavenumber across the mid-infrared spectrum revealed shared association patterns for groups of genomically distant loci, highlighting clusters of loci linked through their biological roles in lactation and their presumed impacts on the chemical composition of milk.


Adjusted FT-MIR spectra records for 38,085 multi-breed and crossbred New Zealand dairy cows were derived from 100,571 FT-MIR spectra records from individual milk samples collected as part of routine herd testing conducted by Livestock Improvement Corporation (LIC) in the 2017/18 season.

Imputed whole-genome sequence genotypes were generated using a stepwise imputation approach, passing through 50k and high-density (HD) genotype panels.

Usage notes
Mid-infrared spectra wavenumber phenotypes for 38,085 mixed-breed New Zealand dairy cattle.
Imputed genotypes for variants representing significant trait QTL, for the same 38,085 animals.
Relevant 1 Mbp cis-eQTL for 25 genes with co-locating trait and expression QTL peaks.
Description of files provided.
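
As a rough illustration of how the wavenumber phenotypes and imputed genotypes listed above might be used together, the sketch below regresses each wavenumber phenotype on the allele dosage of a single variant and returns a per-wavenumber effect estimate and t-statistic. It is a minimal example under stated assumptions, not the analysis pipeline used for this dataset: it runs on synthetic data rather than the deposited files (whose names and layout are given in the file description), the function name per_wavenumber_association is hypothetical, and plain least squares ignores breed composition and relatedness, which an actual GWAS in a mixed-breed population would need to account for.

```python
import numpy as np


def per_wavenumber_association(phenotypes, dosage):
    """Regress each wavenumber phenotype on one variant's allele dosage.

    phenotypes: array of shape (n_animals, n_wavenumbers)
    dosage:     array of shape (n_animals,), allele dosage in [0, 2]
    Returns (slopes, t_stats), each of length n_wavenumbers.

    Hypothetical helper: simple least squares with an intercept, with no
    adjustment for breed or relatedness.
    """
    y = np.asarray(phenotypes, dtype=float)
    x = np.asarray(dosage, dtype=float)
    n = x.shape[0]

    # Centre both sides so the intercept drops out of the slope estimate.
    xc = x - x.mean()
    yc = y - y.mean(axis=0)

    sxx = xc @ xc
    slopes = (xc @ yc) / sxx

    # Residual variance per wavenumber (n - 2 degrees of freedom).
    resid = yc - np.outer(xc, slopes)
    sigma2 = (resid ** 2).sum(axis=0) / (n - 2)
    t_stats = slopes / np.sqrt(sigma2 / sxx)
    return slopes, t_stats


# Synthetic stand-in data (NOT the deposited dataset): 500 animals,
# 5 wavenumbers, with a simulated effect on the third wavenumber only.
rng = np.random.default_rng(0)
n_animals, n_wavenumbers = 500, 5
dosage = rng.integers(0, 3, size=n_animals).astype(float)
phenotypes = rng.normal(size=(n_animals, n_wavenumbers))
phenotypes[:, 2] += 0.5 * dosage

slopes, t_stats = per_wavenumber_association(phenotypes, dosage)
```

Plotting the t-statistics (or their p-values) against wavenumber for one variant gives an association profile across the spectrum, which is the kind of per-locus pattern the abstract describes comparing between loci.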


Livestock Improvement Corporation

Ministry for Primary Industries, Award: Sustainable Food & Fibre Futures (Funding no: PGP06-17006)
