Recognition of the four Watson–Crick base pairs in the DNA minor groove by synthetic ligands

Abstract
The design of synthetic ligands that read the information stored in the DNA double helix has been a long-standing goal at the interface of chemistry and biology1,2,3,4,5. Cell-permeable small molecules that target predetermined DNA sequences offer a potential approach for the regulation of gene expression6. Oligodeoxynucleotides that recognize the major groove of double-helical DNA via triple-helix formation bind to a broad range of sequences with high affinity and specificity3,4. Although oligonucleotides and their analogues have been shown to interfere with gene expression7,8, the triple-helix approach is limited to recognition of purines and suffers from poor cellular uptake. The subsequent development of pairing rules for minor-groove binding polyamides containing pyrrole (Py) and imidazole (Im) amino acids offers a second code to control sequence specificity9,10,11. An Im/Py pair distinguishes G·C from C·G and both of these from A·T/T·A base pairs9,10,11. A Py/Py pair distinguishes A·T/T·A from G·C/C·G base pairs but does not discriminate A·T from T·A9,10,11,12,13,14. To break this degeneracy, we have added a new aromatic amino acid, 3-hydroxypyrrole (Hp), to the repertoire to test for pairings that discriminate A·T from T·A. We find that replacement of a single hydrogen atom with a hydroxy group in an Hp/Py pairing regulates affinity and specificity by an order of magnitude. By incorporation of this third amino acid, hydroxypyrrole–imidazole–pyrrole polyamides form four ring pairings (Im/Py, Py/Im, Hp/Py and Py/Hp) that distinguish all four Watson–Crick base pairs in the minor groove of DNA.
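
The pairing rules summarized above can be read as a lookup table from ring pairings to base pairs. The following is a minimal illustrative sketch in Python, not part of the reported work. The names `PAIRING_RULES`, `recognized_base_pairs` and `is_degenerate` are hypothetical, and base pairs are written as two-letter strings (for example "GC" for G·C). The abstract does not state which of Hp/Py and Py/Hp reads T·A and which reads A·T; the assignment below (Hp/Py for T·A, Py/Hp for A·T) follows the convention generally used for these polyamides and should be treated as an assumption here.

```python
# Illustrative lookup table for polyamide ring pairings (hypothetical names).
# Keys are (ring on one strand, ring on the opposing strand); values are the
# base pairs the pairing is taken to recognize.
PAIRING_RULES = {
    ("Im", "Py"): {"GC"},
    ("Py", "Im"): {"CG"},
    # Assumption: the Hp orientation is not given in the abstract.
    ("Hp", "Py"): {"TA"},
    ("Py", "Hp"): {"AT"},
    # Py/Py is degenerate: it does not tell A·T from T·A.
    ("Py", "Py"): {"AT", "TA"},
}


def recognized_base_pairs(ring_pairs):
    """Return, for each ring pairing in order, the set of base pairs it reads."""
    return [PAIRING_RULES[pair] for pair in ring_pairs]


def is_degenerate(ring_pair):
    """True if the pairing is compatible with more than one base pair."""
    return len(PAIRING_RULES[ring_pair]) > 1


if __name__ == "__main__":
    polyamide = [("Im", "Py"), ("Hp", "Py"), ("Py", "Hp"), ("Py", "Im")]
    print(recognized_base_pairs(polyamide))
    print(is_degenerate(("Py", "Py")))
```

In this sketch, replacing a Py/Py pairing with Hp/Py or Py/Hp is what removes the A·T/T·A ambiguity: the four non-degenerate entries map one-to-one onto the four Watson–Crick base pairs.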