Genomic structural variants (SVs) are abundant in humans, differing from other forms of variation in extent, origin and functional impact. Despite progress in SV characterization, the nucleotide-resolution architecture of most SVs remains unknown. We constructed a map of unbalanced SVs (that is, copy number variants) based on whole-genome DNA sequencing data from 185 human genomes, integrating evidence from complementary SV discovery approaches with extensive experimental validations. Our map encompassed 22,025 deletions and 6,000 additional SVs, including insertions and tandem duplications. Most SVs (53%) were mapped to nucleotide resolution, which facilitated analysing their origin and functional impact. We examined numerous whole and partial gene deletions with a genotyping approach and observed a depletion of gene disruptions amongst high-frequency deletions. Furthermore, we observed differences in the size spectra of SVs originating from distinct formation mechanisms, and constructed a map of SV hotspots formed by common mechanisms. Our analytical framework and SV map serve as a resource for sequencing-based association studies.

Original publication

DOI

10.1038/nature09708

Type

Journal article

Journal

Nature

Publication Date

3 February 2011

Volume

470

Pages

59–65

Keywords

DNA Copy Number Variations; Gene Duplication; Genetic Predisposition to Disease; Genetics, Population; Genome, Human; Genomics; Genotype; Humans; Mutagenesis, Insertional; Reproducibility of Results; Sequence Analysis, DNA; Sequence Deletion