Leveraging base-pair mammalian constraint to understand genetic variation and human disease.
Sullivan PF., Meadows JRS., Gazal S., Phan BN., Li X., Genereux DP., Dong MX., Bianchi M., Andrews G., Sakthikumar S., Nordin J., Roy A., Christmas MJ., Marinescu VD., Wang C., Wallerman O., Xue J., Yao S., Sun Q., Szatkiewicz J., Wen J., Huckins LM., Lawler A., Keough KC., Zheng Z., Zeng J., Wray NR., Li Y., Johnson J., Chen J., Zoonomia Consortium., Paten B., Reilly SK., Hughes GM., Weng Z., Pollard KS., Pfenning AR., Forsberg-Nilsson K., Karlsson EK., Lindblad-Toh K.
Thousands of genomic regions have been associated with heritable human diseases, but attempts to elucidate biological mechanisms are impeded by an inability to discern which genomic positions are functionally important. Evolutionary constraint is a powerful predictor of function, agnostic to cell type or disease mechanism. Single-base phyloP scores from 240 mammals identified 3.3% of the human genome as significantly constrained and likely functional. We compared phyloP scores to genome annotation, association studies, copy-number variation, clinical genetics findings, and cancer data. Constrained positions are more strongly enriched for variants that explain common disease heritability than other functional annotations are. Our results improve variant annotation but also highlight that the regulatory landscape of the human genome needs further exploration and linking to disease.