Data from: Non-additive association analysis using proxy phenotypes identifies novel cattle syndromes

Cite this dataset

Reynolds, Edwardo; Littlejohn, Mathew (2021). Data from: Non-additive association analysis using proxy phenotypes identifies novel cattle syndromes [Dataset]. Dryad.


Mammalian species carry ~100 loss-of-function variants per individual, of which ~1-5 impact essential genes and cause embryonic lethality or severe disease when homozygous. The functions of the remainder are more difficult to resolve, though they are assumed to impact fitness in less manifest ways. Here, we present data behind one of the largest sequence-resolution screens of cattle to date, targeting discovery and validation of non-additive effects in 130,725 animals. We highlight six novel recessive loci with impacts generally exceeding the largest effect variants identified from additive GWAS, presenting analogues of human diseases and hitherto unrecognised disorders. These loci present compelling missense (PLCD4, MTRF1, DPF2), premature stop (MUS81), and splice-disrupting mutations (GALNT2, FGD4), together explaining substantial proportions of inbreeding depression. These results demonstrate that the frequency distribution of deleterious alleles segregating in selected species can afford sufficient power to directly map novel disorders, presenting selection opportunities to minimise the incidence of genetic disease.


The datasets uploaded here include genotypes and phenotypes at various pre- and post-processing stages; more detailed information regarding the generation and processing of these data can be found in the accompanying manuscript. A summary of the uploaded datasets is given in the README file. Note that the DNA and RNA sequence datasets used for sequence imputation and splice QTL analysis have been uploaded separately to the NCBI Sequence Read Archive.

Usage notes

Genotype, phenotype, and pedigree data are uploaded as separate files and should be joined using the individual identifiers common to each relevant fileset. Note that the files have been named using phen_*, gen_*, other*, and source_data* nomenclature according to the numbers and descriptions in the three categories outlined above. A README file uploaded as part of this submission details the exact file contents and usage.
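
As a rough illustration of joining on a shared individual identifier, the Python sketch below merges a toy phenotype table with a toy genotype table using only the standard library. The column names (animal_id, trait, genotype), the comma delimiter, and the table contents are assumptions made for this example and are not taken from the dataset; the README should be consulted for the actual identifier column and file layout.

```python
import csv
import io


def read_by_id(handle, id_column):
    """Read a delimited table into a dict keyed by the individual identifier."""
    rows = {}
    for row in csv.DictReader(handle):
        rows[row[id_column]] = row
    return rows


def inner_join(left, right):
    """Keep only individuals present in both tables, merging their columns."""
    joined = []
    for key in sorted(left.keys() & right.keys()):
        merged = dict(left[key])
        merged.update(right[key])
        joined.append(merged)
    return joined


# Toy stand-ins for a phen_* and a gen_* file; names and values are made up.
phen_text = "animal_id,trait\nA1,3.2\nA2,2.9\nA3,3.5\n"
gen_text = "animal_id,genotype\nA2,0\nA3,2\nA4,1\n"

phen = read_by_id(io.StringIO(phen_text), "animal_id")
gen = read_by_id(io.StringIO(gen_text), "animal_id")
joined = inner_join(phen, gen)

for record in joined:
    print(record)
```

An inner join is used here so that only animals with both a phenotype and a genotype record are retained; animals present in one table only (A1 and A4 in the toy data) are dropped. With real files, the io.StringIO objects would be replaced by opened file handles.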


Ministry of Business, Innovation and Employment, Award: CONT-57639-ENDRP-LIC

Ministry for Primary Industries, Award: Primary Growth Partnership (now Sustainable Food & Fibre Futures)