Maximally Efficient Modeling of DNA Sequence Motifs at All Levels of Complexity
- 1 April 2011
- journal article
- research article
- Published by Oxford University Press (OUP) in Genetics
- Vol. 187 (4), 1219-1224
- https://doi.org/10.1534/genetics.110.126052
Abstract
Identification of transcription factor binding sites is necessary for deciphering gene regulatory networks. Several new methods provide extensive data about the specificity of transcription factors, but most methods for analyzing these data to obtain specificity models either are limited in scope (for example, by assuming additive interactions) or are inefficient in their exploration of more complex models. This article describes an approach, encoding of DNA sequences as the vertices of a regular simplex, that allows simultaneous direct comparison of simple and complex models, with higher-order parameters fit to the residuals of lower-order models. In addition to providing an efficient assessment of all model parameters, this approach can yield valuable insight into the mechanism of binding by highlighting features that are critical to accurate models.
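
The simplex encoding and residual fitting described in the abstract can be illustrated with a minimal Python sketch. It is not the article's implementation: the particular tetrahedron coordinates assigned to each base, the function names (`encode_mono`, `encode_pairs`, `fit_hierarchical`), and the use of ordinary least squares are all assumptions made for illustration. Each base maps to a vertex of a regular tetrahedron (a regular simplex in three dimensions), pairwise interaction features are outer products of two positions' vectors, and the second-order parameters are fit to whatever the first-order model leaves unexplained.

```python
import itertools
import numpy as np

# Assumed vertex assignment: any regular tetrahedron centred on the origin
# would serve; these coordinates are an illustrative choice.
SIMPLEX = {
    "A": np.array([1.0, -1.0, -1.0]),
    "C": np.array([-1.0, 1.0, -1.0]),
    "G": np.array([-1.0, -1.0, 1.0]),
    "T": np.array([1.0, 1.0, 1.0]),
}

def encode_mono(seq):
    """First-order features: 3 coordinates per position."""
    return np.concatenate([SIMPLEX[b] for b in seq])

def encode_pairs(seq):
    """Second-order features: 9 products per pair of positions (needs len(seq) >= 2)."""
    feats = [
        np.outer(SIMPLEX[seq[i]], SIMPLEX[seq[j]]).ravel()
        for i, j in itertools.combinations(range(len(seq)), 2)
    ]
    return np.concatenate(feats)

def fit_hierarchical(seqs, y):
    """Fit the mean, then first-order terms, then second-order terms to the residuals."""
    y = np.asarray(y, dtype=float)
    mean = y.mean()
    X1 = np.array([encode_mono(s) for s in seqs])
    w1, *_ = np.linalg.lstsq(X1, y - mean, rcond=None)
    resid1 = y - mean - X1 @ w1            # what the additive model misses
    X2 = np.array([encode_pairs(s) for s in seqs])
    w2, *_ = np.linalg.lstsq(X2, resid1, rcond=None)
    resid2 = resid1 - X2 @ w2              # what remains after pair terms
    return mean, w1, w2, resid2
```

For example, calling `fit_hierarchical` on all 16 dinucleotides with one measured value per sequence gives 6 first-order and 9 second-order parameters. Under this encoding the feature columns are mutually orthogonal over the complete sequence set, which is why fitting the pair terms to the residuals does not disturb the first-order estimates.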