Computational Rare Disease Genomics Group

About us

We use computational approaches and large genomic datasets to uncover novel genetic variants that cause rare disease, and understand the mechanisms through which they do so.

Clinical genetic testing has become commonplace for individuals with rare disorders. Identifying the genetic cause of a disorder is of huge benefit to both patients and their families: it allows us to screen additional family members to identify those also at risk, return an accurate diagnosis to the patient, and dictate personalised treatment approaches. With current approaches, however, we only find a genetic diagnosis in around half of all individuals with rare disorders. These approaches focus almost exclusively on the regions of the genome that code directly for proteins.

We believe that a subset of the undiagnosed patients will have genetic variants in regions outside of this protein-coding sequence that have crucial roles in regulating how much protein is produced. Our aim is to identify these disease-causing regulatory variants and determine how they lead to disease. By identifying these variants, we hope to influence clinical genetic testing guidelines and allow a valuable genetic diagnosis to be returned to more rare disease patients.

We are particularly fascinated by untranslated regions (UTRs) and aim to understand how and when genetic variants within them cause rare disorders. We hope that this knowledge will also help us design new therapeutic approaches that modulate UTR-mediated regulation.

Untranslated regions (UTRs) are the regions of a gene directly up- and down-stream of the protein-coding region. They form part of the mRNA molecule, but are not translated into protein. These regions have very important regulatory roles: they control the stability of the RNA, its location within the cell, and the rate at which it is translated into protein. Variants within these regions that affect these regulatory processes can therefore have a large impact. However, we are currently limited in our ability to predict the likely effect of any single variant. We aim to identify subsets of UTR variants that are deleterious and create tools and resources to help find and interpret these variants.
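To make the layout of a transcript concrete, the short Python sketch below splits a toy mRNA sequence into its 5' UTR, coding sequence and 3' UTR, and reports which region a given position falls in. The sequence, the coordinates and the function names (`split_transcript`, `classify_position`) are invented for illustration only; they are not one of our tools, and real annotation would take coding-region boundaries from a gene annotation resource.

```python
from typing import NamedTuple


class TranscriptRegions(NamedTuple):
    five_prime_utr: str
    cds: str
    three_prime_utr: str


def split_transcript(mrna: str, cds_start: int, cds_end: int) -> TranscriptRegions:
    """Split an mRNA into 5' UTR, coding sequence (CDS) and 3' UTR.

    Coordinates are 0-based and half-open: the CDS is mrna[cds_start:cds_end].
    """
    if not 0 <= cds_start < cds_end <= len(mrna):
        raise ValueError("CDS coordinates must lie within the transcript")
    return TranscriptRegions(
        five_prime_utr=mrna[:cds_start],   # upstream of the coding region
        cds=mrna[cds_start:cds_end],       # translated into protein
        three_prime_utr=mrna[cds_end:],    # downstream of the coding region
    )


def classify_position(pos: int, cds_start: int, cds_end: int) -> str:
    """Name the transcript region containing a 0-based position."""
    if pos < cds_start:
        return "5' UTR"
    if pos < cds_end:
        return "CDS"
    return "3' UTR"


# Toy transcript (hypothetical): 6-base 5' UTR, 9-base CDS, 6-base 3' UTR.
toy_mrna = "GGCACC" + "AUGGCUUAA" + "UUUAAA"
regions = split_transcript(toy_mrna, cds_start=6, cds_end=15)
print(regions)
print(classify_position(2, 6, 15))
```

A variant at position 2 in this toy transcript lies in the 5' UTR: it is present in the mRNA but changes no amino acid, which is why its effect has to be judged through regulation rather than through the protein sequence.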

Data

We mainly analyse publicly available large-scale genomic datasets including:

In addition, we collaborate with others to access large patient cohorts including the Centre for Mendelian Genomics at the Broad Institute (through collaboration with Anne O’Donnell Luria and Heidi Rehm) and the Deciphering Developmental Disorders (DDD) cohort at the Wellcome Sanger Institute (through collaboration with Matt Hurles, Caroline Wright and Hillary Martin).

Approaches

Using population cohorts to identify specific variants in functional non-coding elements that show signals of being under strong negative selection, indicating that they are likely deleterious
Identifying disease-causing non-coding region variants in individuals with rare disorders
Creating tools and resources to improve annotation of functional non-coding variants
Developing clinical guidelines to support interpretation of non-coding region variants
Understanding more about UTR-mediated gene regulation and how this may be modified therapeutically
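
As a rough illustration of the first approach, one simple signal of negative selection is that deleterious variants tend to stay rare in the population, so a variant class under stronger selection shows a higher fraction of variants seen only once (singletons). The sketch below computes that fraction per variant category. The category labels and allele counts are invented toy data, the function name `proportion_singletons` is hypothetical, and this is a simplified proxy rather than our actual pipeline; real analyses would also need to account for differences in mutation rate between sequence contexts.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def proportion_singletons(variants: Iterable[Tuple[str, int]]) -> Dict[str, float]:
    """Return the fraction of observed variants with allele count 1, per category.

    `variants` yields (category, allele_count) pairs. Variants with an
    allele count below 1 were not observed and are ignored.
    """
    totals: Dict[str, int] = defaultdict(int)
    singletons: Dict[str, int] = defaultdict(int)
    for category, allele_count in variants:
        if allele_count < 1:
            continue
        totals[category] += 1
        if allele_count == 1:
            singletons[category] += 1
    return {category: singletons[category] / totals[category] for category in totals}


# Toy data (invented): (variant category, allele count in a population cohort).
toy_variants = [
    ("synonymous", 1), ("synonymous", 12), ("synonymous", 40), ("synonymous", 3),
    ("utr_candidate", 1), ("utr_candidate", 1), ("utr_candidate", 1), ("utr_candidate", 5),
]
result = proportion_singletons(toy_variants)
print(result)
```

In this toy data the hypothetical `utr_candidate` class has a higher singleton fraction than the synonymous class, which is the direction expected for a class of variants under stronger negative selection.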