Many investigators are now using haplotype-tagging single-nucleotide polymorphism (htSNPs) as a way of screening regions of the genome for association with disease. A common approach is to genotype htSNPs in a study population and to use this information to draw inferences about each individual’s haplotypic makeup, including SNPs that were not directly genotyped. To test the validity of this approach, we simulated the exercise of typing htSNPs in a large sample of individuals and compared the true and inferred haplotypes. The accuracy of haplotype inference varied, depending on the method of selecting htSNPs, the linkage-disequilibrium structure of the region, and the amount of missing data. At the stage of selection of htSNPs, haplotype-block-based methods required a larger number of htSNPs than did unstructured methods but gave lower levels of error in haplotype inference, particularly when there was a significant amount of missing data. We present a Web-based utility that allows investigators to compare the likely error rates of different sets of htSNPs and to arrive at an economical set of htSNPs that provides acceptable levels of accuracy in haplotype inference.


Journal article


Am J Hum Genet

Publication Date





438 - 448