A systematic survey of loss-of-function variants in human protein-coding genes.
MacArthur DG., Balasubramanian S., Frankish A., Huang N., Morris J., Walter K., Jostins L., Habegger L., Pickrell JK., Montgomery SB., Albers CA., Zhang ZD., Conrad DF., Lunter G., Zheng H., Ayub Q., DePristo MA., Banks E., Hu M., Handsaker RE., Rosenfeld JA., Fromer M., Jin M., Mu XJ., Khurana E., Ye K., Kay M., Saunders GI., Suner M-M., Hunt T., Barnes IHA., Amid C., Carvalho-Silva DR., Bignell AH., Snow C., Yngvadottir B., Bumpstead S., Cooper DN., Xue Y., Romero IG., 1000 Genomes Project Consortium, Wang J., Wang J., Li Y., Gibbs RA., McCarroll SA., Dermitzakis ET., Pritchard JK., Barrett JC., Harrow J., Hurles ME., Gerstein MB., Tyler-Smith C.
Genome-sequencing studies indicate that all humans carry many genetic variants predicted to cause loss of function (LoF) of protein-coding genes, suggesting unexpected redundancy in the human genome. Here we apply stringent filters to 2951 putative LoF variants obtained from 185 human genomes to determine their true prevalence and properties. We estimate that human genomes typically contain ~100 genuine LoF variants with ~20 genes completely inactivated. We identify rare and likely deleterious LoF alleles, including 26 known and 21 predicted severe disease-causing variants, as well as common LoF variants in nonessential genes. We describe functional and evolutionary differences between LoF-tolerant and recessive disease genes and a method for using these differences to prioritize candidate genes found in clinical sequencing studies.