MOTIVATION: Genetic variation at classical HLA alleles influences many phenotypes, including susceptibility to autoimmune disease, resistance to pathogens and the risk of adverse drug reactions. However, classical HLA typing methods are often prohibitively expensive for large-scale studies. We previously described a method for imputing classical alleles from linked SNP genotype data. Here, we present a modification of the original algorithm implemented in a freely available software suite that combines local data preparation and QC with probabilistic imputation through a remote server. RESULTS: We introduce two modifications to the original algorithm. First, we present a novel SNP selection function that leads to pronounced increases (up by 40% in some scenarios) in call rate. Second, we develop a parallelized model building algorithm that allows us to process a reference set of over 2500 individuals. In a validation experiment, we show that our framework produces highly accurate HLA type imputations at class I and class II loci for independent datasets: at call rates of 95-99%, imputation accuracy is between 92% and 98% at the four-digit level and over 97% at the two-digit level. We demonstrate utility of the method through analysis of a genome-wide association study for psoriasis where there is a known classical HLA risk allele (HLA-C*06:02). We show that the imputed allele shows stronger association with disease than any single SNP within the region. The imputation framework, HLA*IMP, provides a powerful tool for dissecting the architecture of genetic risk within the HLA. AVAILABILITY: HLA*IMP, implemented in C++ and Perl, is available from http://oxfordhla.well.ox.ac.uk and is free for academic use.

Original publication

DOI

10.1093/bioinformatics/btr061

Type

Journal article

Journal

Bioinformatics

Publication Date

01/04/2011

Volume

27

Pages

968 - 972

Keywords

Algorithms, Alleles, Genome-Wide Association Study, Genotype, HLA Antigens, Humans, Polymorphism, Single Nucleotide, Psoriasis, Software