Variable prediction accuracy of polygenic scores within an ancestry group

Mostafavi, Hakhamanesh 1; Harpak, Arbel 1; Agarwal, Ipsita 1; Conley, Dalton 2; Pritchard, Jonathan 3; Przeworski, Molly 1

Published Feb 24, 2020 on Dryad. https://doi.org/10.5061/dryad.66t1g1jxs

Data files

Feb 24, 2020 version files (18.88 GB total)

gwas_by_sample_characteristics.tar.gz (10.60 GB)
README.txt (497 B)
standard_vs_sibling_gwas.tar.gz (8.28 GB)

Abstract

Fields as diverse as human genetics and sociology are increasingly using polygenic scores based on genome-wide association studies (GWAS) for phenotypic prediction. However, recent work has shown that polygenic scores have limited portability across groups of different genetic ancestries, restricting the contexts in which they can be used reliably and potentially creating serious inequities in future clinical applications. Using the UK Biobank data, we demonstrate that even within a single ancestry group (i.e., when there are negligible differences in linkage disequilibrium or in causal allele frequencies), the prediction accuracy of polygenic scores can depend on characteristics such as the socio-economic status, age or sex of the individuals in whom the GWAS and the prediction were conducted, as well as on the GWAS design. Our findings highlight both the complexities of interpreting polygenic scores and underappreciated obstacles to their broad use.
