Upstream open reading frames (uORFs) are tissue-specific cis-regulators of protein translation. Isolated reports have shown that variants that create or disrupt uORFs can cause disease. Here, in a systematic genome-wide study using 15,708 whole genome sequences, we show that variants that create new upstream start codons, and variants disrupting stop sites of existing uORFs, are under strong negative selection. This selection signal is significantly stronger for variants arising upstream of genes intolerant to loss-of-function variants. Furthermore, variants creating uORFs that overlap the coding sequence show signals of selection equivalent to coding missense variants. Finally, we identify specific genes where modification of uORFs likely represents an important disease mechanism, and report a novel uORF frameshift variant upstream of NF2 in neurofibromatosis. Our results highlight uORF-perturbing variants as an under-recognised functional class that contributes to penetrant human disease, and demonstrate the power of large-scale population sequencing data in studying non-coding variant classes.
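
The two variant classes named in the abstract (variants that create a new upstream start codon, and variants that disrupt the stop site of an existing uORF) can be illustrated with a minimal sketch. This is not the authors' pipeline: the function names are hypothetical, the 5' UTR is assumed to be a plus-strand DNA string with 0-based positions, only single-nucleotide substitutions are handled, and translational context, transcript choice, and overlap with the coding sequence are ignored.

```python
# Illustrative sketch only; names and simplifications are assumptions,
# not the method used in the publication.
STOP_CODONS = {"TAA", "TAG", "TGA"}


def apply_snv(seq, pos, alt):
    """Return seq with the base at 0-based position pos replaced by alt."""
    return seq[:pos] + alt + seq[pos + 1:]


def creates_upstream_atg(utr, pos, alt):
    """True if the substitution creates an ATG that is absent from the reference."""
    mutated = apply_snv(utr, pos, alt)
    # Only the (up to three) 3-base windows overlapping pos can change.
    for start in range(max(0, pos - 2), min(pos, len(utr) - 3) + 1):
        if mutated[start:start + 3] == "ATG" and utr[start:start + 3] != "ATG":
            return True
    return False


def uorf_stops(utr):
    """List (atg_start, stop_start) for each ATG with an in-frame stop inside the UTR."""
    found = []
    for start in range(len(utr) - 2):
        if utr[start:start + 3] != "ATG":
            continue
        # Walk codon by codon from the ATG until the first in-frame stop.
        for i in range(start + 3, len(utr) - 2, 3):
            if utr[i:i + 3] in STOP_CODONS:
                found.append((start, i))
                break
    return found


def removes_uorf_stop(utr, pos, alt):
    """True if the substitution turns the stop codon of an existing uORF into a sense codon."""
    mutated = apply_snv(utr, pos, alt)
    for _, stop in uorf_stops(utr):
        if stop <= pos < stop + 3 and mutated[stop:stop + 3] not in STOP_CODONS:
            return True
    return False


# Toy 5' UTR with one uORF: ATG at index 7, in-frame TAG at index 13.
toy_utr = "CCACGGGATGAAATAGCC"
```

In this toy sequence, changing index 3 from C to T turns ACG into ATG, while changing index 13 from T to C turns the uORF's TAG stop into CAG. A substitution that swaps one stop codon for another (for example TAG to TAA) is deliberately not counted as stop-disrupting.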

Original publication

DOI

10.1038/s41467-019-10717-9

Type

Journal article

Journal

Nature Communications

Publication Date

05/2020

Volume

11

Addresses

National Heart and Lung Institute and MRC London Institute of Medical Sciences, Imperial College London, Du Cane Road, London, W12 0NN, UK. n.whiffin@imperial.ac.uk.

Keywords

Genome Aggregation Database Production Team; Genome Aggregation Database Consortium; Humans; Proteins; 5' Untranslated Regions; Base Sequence; Open Reading Frames; Genome, Human; Genetic Variation; Loss of Function Mutation