Polygenic prediction using genome-wide SNPs can provide high prediction accuracy for complex traits. Here, we investigate the question of how to account for genetic ancestry when conducting polygenic prediction. We show that the accuracy of polygenic prediction in structured populations may be partly due to genetic ancestry. However, we hypothesized that explicitly modeling ancestry could improve polygenic prediction accuracy. We analyzed three GWAS of hair color (HC), tanning ability (TA), and basal cell carcinoma (BCC) in European Americans (sample size from 7,440 to 9,822) and considered two widely used polygenic prediction approaches: polygenic risk scores (PRSs) and best linear unbiased prediction (BLUP). We compared polygenic prediction without correction for ancestry to polygenic prediction with ancestry as a separate component in the model. In 10-fold cross-validation using the PRS approach, the R² for HC increased by 66% (0.0456 to 0.0755; P < 10⁻¹⁶), the R² for TA increased by 123% (0.0154 to 0.0344; P < 10⁻¹⁶), and the liability-scale R² for BCC increased by 68% (0.0138 to 0.0232; P < 10⁻¹⁶) when explicitly modeling ancestry, which prevents ancestry effects from entering into each SNP effect and being overweighted. Surprisingly, explicitly modeling ancestry produces a similar improvement when using the BLUP approach, which fits all SNPs simultaneously in a single variance component and causes ancestry to be underweighted. We validate our findings via simulations, which show that the differences in prediction accuracy will increase in magnitude as sample sizes increase. In summary, our results show that explicitly modeling ancestry can be important in both PRS and BLUP prediction.
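
The contrast between a PRS built without ancestry correction and one that treats ancestry as a separate model component can be illustrated with a small simulation. The sketch below is not the authors' pipeline: it uses a toy two-population model, a single principal component as the ancestry proxy, one train/test split instead of 10-fold cross-validation, and no P-value thresholding or liability-scale conversion. All function names and simulation parameters (`marginal_effects`, `top_pc`, `compare_prs`, the effect sizes, and the allele-frequency shifts) are assumptions introduced for illustration only.

```python
import numpy as np


def r2(y, yhat):
    """Squared Pearson correlation between observed and predicted values."""
    return float(np.corrcoef(y, yhat)[0, 1] ** 2)


def marginal_effects(G, y, covar=None):
    """Per-SNP least-squares slopes, one SNP at a time (PRS-style weights).

    If covar is given, y and every SNP are first residualized on it, so that
    the covariate (here: ancestry) cannot leak into the SNP effects.
    """
    if covar is not None:
        C = np.column_stack([np.ones(len(y)), covar])
        y = y - C @ np.linalg.lstsq(C, y, rcond=None)[0]
        G = G - C @ np.linalg.lstsq(C, G, rcond=None)[0]
    Gc = G - G.mean(axis=0)
    yc = y - y.mean()
    return (Gc.T @ yc) / (Gc ** 2).sum(axis=0)


def top_pc(G):
    """First principal component of the standardized genotype matrix."""
    Z = (G - G.mean(axis=0)) / G.std(axis=0)
    U, S, _ = np.linalg.svd(Z, full_matrices=False)
    return U[:, 0] * S[0]


def compare_prs(n=2000, m=500, n_train=1600, seed=0):
    """Return (R2 without ancestry correction, R2 with ancestry modeled)."""
    rng = np.random.default_rng(seed)

    # Toy structured population: two groups with shifted allele frequencies
    # and a direct group effect on the phenotype (all values are assumptions).
    pop = rng.integers(0, 2, size=n)
    base = rng.uniform(0.2, 0.8, size=m)
    shift = rng.normal(0.0, 0.1, size=m)
    freq = np.clip(base + np.outer(pop, shift), 0.05, 0.95)
    G = rng.binomial(2, freq).astype(float)
    beta = rng.normal(0.0, 0.08, size=m)
    y = G @ beta + 1.0 * pop + rng.normal(0.0, 1.0, size=n)

    # Ancestry proxy computed from genotypes only (no phenotype involved).
    pc = top_pc(G)
    tr, te = slice(0, n_train), slice(n_train, None)

    # Approach A: SNP effects estimated with no ancestry correction.
    w_plain = marginal_effects(G[tr], y[tr])
    r2_plain = r2(y[te], G[te] @ w_plain)

    # Approach B: ancestry as its own component, SNP effects adjusted for it.
    w_adj = marginal_effects(G[tr], y[tr], covar=pc[tr])
    C_tr = np.column_stack([np.ones(n_train), pc[tr]])
    coef = np.linalg.lstsq(C_tr, y[tr], rcond=None)[0]
    ancestry_te = coef[0] + coef[1] * pc[te]
    r2_ancestry = r2(y[te], ancestry_te + G[te] @ w_adj)

    return r2_plain, r2_ancestry


if __name__ == "__main__":
    plain, with_ancestry = compare_prs()
    print(f"R2 without ancestry correction: {plain:.3f}")
    print(f"R2 with ancestry modeled:       {with_ancestry:.3f}")
```

In this toy setting the uncorrected weights absorb the group effect into many SNPs at once, so the ancestry-aware predictor would be expected to score higher on held-out samples. The exact values depend entirely on the assumed simulation parameters and say nothing about the magnitudes reported for HC, TA, or BCC.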

DOI

10.1002/gepi.21906

Type

Journal article

Publication Date

2015-09-01

Volume

39

Pages

427-438

Total pages

11

Addresses

Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, Massachusetts, United States of America.

Keywords

Humans; Risk; Genotype; Multifactorial Inheritance; Phenotype; Polymorphism, Single Nucleotide; Principal Component Analysis; Models, Genetic; Genome-Wide Association Study; Basal Cell Carcinoma