We describe a general method based on principal coordinates analysis to predict the effects of single-nucleotide polymorphisms within regulatory sequences on DNA-protein interactions. We use binding data for the transcription factor NF-kappaB as a test system. The method incorporates the effects of interactions between base pair positions in the binding site, and we demonstrate that such interactions are present for NF-kappaB. Prediction accuracy is higher than with profile models, as confirmed by cross-validation and by experimental verification of our predictions for additional sequences. The binding affinities of all potential NF-kappaB sites on human chromosome 22, together with the effects of known single-nucleotide polymorphisms, are calculated to identify likely functional variants. We propose that this approach may be valuable, either on its own or in combination with other methods, when standard profile models are disadvantaged by complex internucleotide interactions.
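
As a rough illustration of the underlying technique, the sketch below applies classical principal coordinates analysis to a handful of short DNA sequences. It is not the authors' implementation: the Hamming distance, the example sequences, and the function names `hamming_distance_matrix` and `pcoa` are all assumptions made for illustration, and the abstract does not specify which distance measure or regression step the paper uses.

```python
import numpy as np

def hamming_distance_matrix(seqs):
    """Pairwise count of mismatching positions (assumed distance measure)."""
    n = len(seqs)
    dist = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            dist[i, j] = sum(a != b for a, b in zip(seqs[i], seqs[j]))
    return dist

def pcoa(dist, n_components=2):
    """Classical principal coordinates analysis of a distance matrix."""
    n = dist.shape[0]
    # Double-centre the squared distances to obtain the Gower matrix.
    centering = np.eye(n) - np.ones((n, n)) / n
    gower = -0.5 * centering @ (dist ** 2) @ centering
    # Eigendecomposition of the symmetric Gower matrix.
    eigvals, eigvecs = np.linalg.eigh(gower)
    # Keep the largest eigenvalues; clip negatives that arise when the
    # distance is not exactly Euclidean.
    order = np.argsort(eigvals)[::-1][:n_components]
    scale = np.sqrt(np.clip(eigvals[order], 0.0, None))
    return eigvecs[:, order] * scale

# Hypothetical 10-bp sites, chosen only to illustrate the calculation.
sites = ["GGGACTTTCC", "GGGAATTTCC", "GGGGCTTTCC", "AGGACTTTCC"]
dist = hamming_distance_matrix(sites)
coords = pcoa(dist, n_components=2)
print(coords)
```

Each row of `coords` places one sequence in a low-dimensional space. In a setting like the one described above, such coordinates could serve as predictors of measured binding affinity, which is one way a model can capture dependencies between positions that a per-position profile cannot.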

Type

Journal article

Journal

Proc Natl Acad Sci U S A

Publication Date

2002

Volume

99

Pages

8167-8172