Dryad

Pleiotropy of UK Biobank metabolites

Cite this dataset

Smith, Courtney et al. (2022). Pleiotropy of UK Biobank metabolites [Dataset]. Dryad. https://doi.org/10.5061/dryad.79cnp5hxs

Abstract

Pleiotropy and genetic correlation are widespread features in GWAS, but they are often difficult to interpret at the molecular level. Here, we perform GWAS of 16 metabolites clustered at the intersection of amino acid catabolism, glycolysis, and ketone body metabolism in a subset of UK Biobank. We utilize the well-documented biochemistry jointly impacting these metabolites to analyze pleiotropic effects in the context of their pathways. Among the 213 lead GWAS hits, we find a strong enrichment for genes encoding pathway-relevant enzymes and transporters. We demonstrate that the effect directions of variants acting on biology between metabolite pairs often contrast with those of upstream or downstream variants as well as the polygenic background. Thus, we find that these outlier variants often reflect biology local to the traits. Finally, we explore the implications for interpreting disease GWAS, underscoring the potential of unifying biochemistry with dense metabolomics data to understand the molecular basis of pleiotropy in complex traits and diseases.

Methods

The details of the dataset processing are provided in our manuscript: https://elifesciences.org/articles/79348

Briefly, we performed GWAS of technically corrected metabolite levels from the Nightingale NMR Metabolomics dataset using BOLT-REML, on 94,464 European-ancestry individuals and on 98,189 individuals in our ancestry-inclusive analysis. We then integrated these results with a curated biochemical map connecting the 16 core metabolites spanning glycolysis, ketones, and amino acids.

Files with names "*_step3.txt" and "*_step2.txt" are the local genetic correlation and local heritability estimates for each approximately independent LD block (Berisa et al. 2016) using rho-HESS (Shi et al. 2017) and HESS (Shi et al. 2016), respectively. These were derived from European-ancestry summary statistics.

Files with names that start with a SNP identifier, "both," or "neither" are the conditional fine-mapping summary statistics from our example loci, generated with the PLINK2 "--condition" option. Please see the manuscript for additional details.
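
The naming conventions above can be turned into a small helper for sorting downloaded files by content type. This is only a sketch: the function name `describe_file` and the returned labels are hypothetical, and it assumes that SNP identifiers in file names are rsIDs (for example, names beginning with "rs" followed by digits) and that a compression suffix may follow ".txt". Neither assumption is confirmed by the dataset description.

```python
import re
from pathlib import Path


def describe_file(name):
    """Guess the content type of a dataset file from its name.

    Hypothetical helper: the labels returned here are illustrative only.
    """
    base = Path(name).name
    # "*_step3.txt" holds local genetic correlation estimates (rho-HESS);
    # a trailing compression suffix such as ".gz" is tolerated (assumption).
    if "_step3.txt" in base:
        return "local genetic correlation (rho-HESS)"
    # "*_step2.txt" holds local heritability estimates (HESS).
    if "_step2.txt" in base:
        return "local heritability (HESS)"
    # Conditional fine-mapping files start with a SNP identifier, "both",
    # or "neither". Assumption: SNP identifiers are rsIDs like "rs123".
    if base.startswith(("both", "neither")) or re.match(r"rs\d+", base):
        return "conditional fine-mapping summary statistics"
    return "unknown"
```

Files that match none of the patterns are labeled "unknown" rather than guessed at, so anything unexpected in the download is easy to spot.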

Usage notes

These data are in standard compressed text format, and no specialty software is needed to open the files. For help with using these data, please contact the corresponding authors of the associated manuscript: https://elifesciences.org/articles/79348
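
As a minimal sketch of reading one of these files with only the Python standard library, the snippet below assumes the compression is gzip (detected from the file's first two bytes, with a fallback to plain text), that the first line is a header, and that columns are separated by tabs or other whitespace. The function names `open_text` and `read_table` are hypothetical, and the actual column layout should be checked against the files themselves.

```python
import gzip
from pathlib import Path


def open_text(path):
    """Open a file as text, handling gzip if the gzip magic bytes are present."""
    path = Path(path)
    with path.open("rb") as fh:
        is_gzip = fh.read(2) == b"\x1f\x8b"
    if is_gzip:
        return gzip.open(path, "rt", encoding="utf-8")
    return path.open("rt", encoding="utf-8")


def read_table(path, max_rows=None):
    """Return (header, rows) from a delimited text file.

    Assumptions: the first line is a header, and fields are tab-separated
    if the header contains a tab, otherwise whitespace-separated.
    """
    with open_text(path) as fh:
        first = fh.readline().rstrip("\r\n")
        delimiter = "\t" if "\t" in first else None
        header = first.split(delimiter)
        rows = []
        for line in fh:
            if max_rows is not None and len(rows) >= max_rows:
                break
            line = line.rstrip("\r\n")
            if line:  # skip blank lines
                rows.append(line.split(delimiter))
    return header, rows
```

Passing `max_rows` is a convenient way to preview the header and the first few records of a large summary statistics file before loading it in full.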

Funding

National Human Genome Research Institute, Award: 5R01HG011432

National Institute on Aging, Award: 5R01AG066490