Discriminative Motifs
- 1 June 2003
- journal article
- research article
- Published by Mary Ann Liebert Inc in Journal of Computational Biology
- Vol. 10 (3-4) , 599-615
- https://doi.org/10.1089/10665270360688219
Abstract
This paper takes a new view of motif discovery, addressing a common problem in existing motif finders. A motif is treated as a feature of the input promoter regions that leads to a good classifier between these promoters and a set of background promoters. This perspective allows us to adapt existing methods of feature selection, a well-studied topic in machine learning, to motif discovery. We develop a general algorithmic framework that can be specialized to work with a wide variety of motif models, including consensus models with degenerate symbols or mismatches, and composite motifs. A key feature of our algorithm is that it measures overrepresentation while maintaining information about the distribution of motif instances in individual promoters. The assessment of a motif's discriminative power is normalized against chance behaviour by a probabilistic analysis. We apply our framework to two popular motif models and are able to detect several known binding sites in sets of co-regulated genes in yeast.
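The idea of treating a motif as a classification feature can be illustrated with a minimal sketch. This is not the paper's algorithm: it assumes the simplest possible motif model (an exact k-mer), uses a hypothetical decision rule of the form "at least t occurrences means target promoter", and omits the probabilistic normalization against chance that the abstract mentions. The function names and toy sequences are invented for illustration only.

```python
from itertools import product


def count_occurrences(motif, seq):
    """Count (possibly overlapping) exact occurrences of motif in seq."""
    k = len(motif)
    return sum(1 for i in range(len(seq) - k + 1) if seq[i:i + k] == motif)


def best_threshold_error(pos_counts, neg_counts):
    """Find the rule 'count >= t => target promoter' with the fewest errors.

    Returns (errors, t). Keeping per-promoter counts, rather than one pooled
    total, is what lets the rule see how instances are distributed.
    """
    best = None
    for t in range(1, max(pos_counts + neg_counts) + 2):
        errors = sum(c < t for c in pos_counts) + sum(c >= t for c in neg_counts)
        if best is None or errors < best[0]:
            best = (errors, t)
    return best


def rank_kmers(positives, background, k):
    """Rank every DNA k-mer by its misclassification count (ascending)."""
    results = []
    for kmer in map("".join, product("ACGT", repeat=k)):
        pos = [count_occurrences(kmer, s) for s in positives]
        neg = [count_occurrences(kmer, s) for s in background]
        errors, t = best_threshold_error(pos, neg)
        results.append((errors, kmer, t))
    results.sort()
    return results


# Toy data (made up): every "target" sequence contains ACGT, no background does.
positives = ["TTACGTTT", "GGACGTAA", "ACGTACGT"]
background = ["TTTTTTTT", "GGGGCCCC", "AATTAATT"]
print(rank_kmers(positives, background, 4)[0])
```

On this toy input the top-ranked entry is the k-mer ACGT with zero errors at threshold 1. A raw error count like this is not comparable across motifs of different lengths or degeneracy, which is the gap a chance-based normalization is meant to close.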