Variation in the human genome sequence is key to understanding susceptibility to disease in modern populations and the history of ancestral populations. Unlocking this information requires knowledge of the patterns and underlying causes of human sequence diversity. By applying a new population-genetic framework to two genome-wide polymorphism surveys, we find that the human genome contains sizeable regions (stretching over tens of thousands of base pairs) that have intrinsically high and low rates of sequence variation. We show that the primary determinant of these patterns is shared genealogical history. Only a fraction of the variation (at most 25%) is due to the local mutation rate. By measuring the average distance over which genealogical histories are typically preserved, these data provide the first genome-wide estimate of the average extent of correlation among variants (linkage disequilibrium). The results are best explained by extreme variability in the recombination rate at a fine scale, and provide the first empirical evidence that such recombination 'hot spots' are a general feature of the human genome and have a principal role in shaping genetic variation in the human population.

Original publication

DOI

10.1038/ng947

Type

Journal article

Journal

Nat Genet

Publication Date

09/2002

Volume

32

Pages

135 - 142

Keywords

Animals; Computer Simulation; Evolution, Molecular; Genetic Variation; Genome, Human; Humans; Linkage Disequilibrium; Mutation; Pan troglodytes; Polymorphism, Single Nucleotide; Recombination, Genetic