Accurate prediction of an individual's phenotype from their DNA sequence is one of the great promises of genomics and precision medicine. We extend a powerful individual-level data Bayesian multiple regression model (BayesR) to a model, SBayesR, that utilises summary statistics from genome-wide association studies (GWAS). In simulation and cross-validation using 12 real traits and 1.1 million variants on 350,000 individuals from the UK Biobank, SBayesR improves prediction accuracy relative to commonly used state-of-the-art summary statistics methods at a fraction of the computational resources. Furthermore, using summary statistics for variants from the largest GWAS meta-analysis (n ≈ 700,000) on height and BMI, we show that, on average across traits and two independent data sets, SBayesR improves prediction R² by 5.2% relative to LDpred and by 26.5% relative to clumping and p value thresholding.

Original publication




Journal article


Nature Communications

Institute for Molecular Bioscience, University of Queensland, St Lucia, Brisbane, 4072, QLD, Australia.


Adipose Tissue; Humans; Alopecia; Diabetes Mellitus, Type 2; Birth Weight; Basal Metabolism; Vital Capacity; Forced Expiratory Volume; Body Mass Index; Body Height; Waist-Hip Ratio; Bayes Theorem; Regression Analysis; Body Composition; Bone Density; Multifactorial Inheritance; Polymorphism, Single Nucleotide; Biological Specimen Banks; Statistics as Topic; Genome-Wide Association Study; Genetic Association Studies