Mapping the landscape of possible macromolecular polymer sequences to their fitness in performing biological functions is a challenge across the biosciences. A paradigm is the case of aptamers, nucleic acids that can be selected to bind particular target molecules. We have characterized the sequence-fitness landscape for aptamers binding allophycocyanin (APC) protein via a novel Closed Loop Aptameric Directed Evolution (CLADE) approach. In contrast to the conventional SELEX methodology, selection and mutation of aptamer sequences were carried out in silico, with explicit fitness assays for 44,131 aptamers of known sequence using DNA microarrays in vitro. We capture the landscape using a predictive machine learning model linking sequence features and function, and validate this model using 5,500 entirely separate test sequences, which give a very high observed versus predicted correlation of 0.87. This approach reveals a complex sequence-fitness mapping and suggests hypotheses for the physical basis of aptameric binding; it also enables rapid design of novel aptamers with desired binding properties. We demonstrate an extension to the approach by incorporating prior knowledge into CLADE, resulting in some of the tightest binding sequences.
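The closed loop described in the abstract (measure fitness, select, mutate in silico, repeat) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the publication: `toy_fitness` is a hypothetical stand-in for the microarray binding assay, and the truncation selection with elitism, the population size, and the mutation rate are arbitrary choices. The `pearson` helper shows how an observed versus predicted correlation, such as the one used to validate the model, could be computed.

```python
import math
import random

BASES = "ACGT"


def mutate(seq, rate, rng):
    """Replace each base with a different random base with probability `rate`."""
    return "".join(
        rng.choice([b for b in BASES if b != base]) if rng.random() < rate else base
        for base in seq
    )


def toy_fitness(seq):
    """Hypothetical stand-in for a measured binding signal: the fraction of G."""
    return seq.count("G") / len(seq)


def evolve(population, fitness, generations, keep, rate, rng):
    """Closed loop (assumed scheme): score, keep the top `keep`, refill by mutation."""
    size = len(population)
    for _ in range(generations):
        ranked = sorted(population, key=fitness, reverse=True)
        parents = ranked[:keep]  # elitism: the best sequences survive unchanged
        children = [mutate(rng.choice(parents), rate, rng) for _ in range(size - keep)]
        population = parents + children
    return sorted(population, key=fitness, reverse=True)


def pearson(xs, ys):
    """Pearson correlation between observed and predicted values."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)


if __name__ == "__main__":
    rng = random.Random(0)
    start = ["".join(rng.choice(BASES) for _ in range(30)) for _ in range(50)]
    final = evolve(start, toy_fitness, generations=20, keep=10, rate=0.05, rng=rng)
    print("best initial fitness:", max(map(toy_fitness, start)))
    print("best final fitness:  ", toy_fitness(final[0]))
```

In a real closed loop the call to `fitness` would be replaced by an experimental measurement for each generation of sequences, which is the step that distinguishes this kind of approach from a purely computational search.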

Original publication

DOI

10.1093/nar/gkn899

Type

Journal article

Journal

Nucleic Acids Research

Publication Date

01/2009

Volume

37

Addresses

Manchester Interdisciplinary Biocentre, The University of Manchester, Manchester, UK. chris.knight@manchester.ac.uk

Keywords

Phycocyanin; Oligonucleotide Array Sequence Analysis; Models, Statistical; Regression Analysis; Directed Molecular Evolution; Sequence Analysis, DNA; Artificial Intelligence; Aptamers, Nucleotide