A non-parametric model for transcription factor binding sites
- 1 October 2003
- journal article
- research article
- Published by Oxford University Press (OUP) in Nucleic Acids Research
- Vol. 31 (19), e116
- https://doi.org/10.1093/nar/gng117
Abstract
We introduce a non-parametric representation of transcription factor binding sites which can model arbitrary dependencies between positions. As two parameters are varied, this representation smoothly interpolates between the empirical distribution of binding sites and the standard position-specific scoring matrix (PSSM). In a test of generalization to unseen binding sites using 10-fold cross-validation on known binding sites for 95 TRANSFAC transcription factors, this representation outperforms PSSMs on between 65 and 89 of the 95 transcription factors, depending on the choice of the two adjustable parameters. We also discuss how the non-parametric representation may be incorporated into frameworks for finding binding sites given only a collection of unaligned promoter regions.
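For readers unfamiliar with the baseline named in the abstract, the following is a minimal sketch of a standard PSSM built from aligned binding sites. It is not the paper's non-parametric model, whose details the abstract does not give. The function names `build_pssm` and `score`, the uniform background, the pseudocount of 1 and the example sites are all illustrative assumptions.

```python
import math

ALPHABET = "ACGT"


def build_pssm(sites, pseudocount=1.0, background=None):
    """Build per-position log-odds scores from aligned, equal-length sites.

    Assumptions (not from the paper): a uniform background and an additive
    pseudocount for each base at each position.
    """
    if background is None:
        background = {base: 0.25 for base in ALPHABET}
    width = len(sites[0])
    if any(len(site) != width for site in sites):
        raise ValueError("all sites must have the same length")

    pssm = []
    for pos in range(width):
        # Count each base at this position, starting from the pseudocount.
        counts = {base: pseudocount for base in ALPHABET}
        for site in sites:
            counts[site[pos]] += 1
        total = sum(counts.values())
        # Log-odds of the smoothed frequency against the background.
        pssm.append(
            {base: math.log2(counts[base] / total / background[base])
             for base in ALPHABET}
        )
    return pssm


def score(pssm, seq):
    """Sum the position-wise log-odds; positions are treated independently."""
    return sum(column[base] for column, base in zip(pssm, seq))


# Hypothetical aligned sites, for illustration only.
example_sites = ["TATAAT", "TATGAT", "TACAAT", "TATAAT"]
example_pssm = build_pssm(example_sites)
```

Because `score` adds one term per position, a PSSM cannot express dependencies between positions, which is the limitation the representation described in the abstract is meant to address.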