Discovery of novel transcription factor binding sites by statistical overrepresentation
Open Access
- 15 December 2002
- journal article
- research article
- Published by Oxford University Press (OUP) in Nucleic Acids Research
- Vol. 30 (24) , 5549-5560
- https://doi.org/10.1093/nar/gkf669
Abstract
Understanding the complex and varied mechanisms that regulate gene expression is an important and challenging problem. A fundamental sub‐problem is to identify DNA binding sites for unknown regulatory factors, given a collection of genes believed to be co‐regulated. We discuss a computational method that identifies good candidates for such binding sites. Unlike local search techniques such as expectation maximization and Gibbs samplers that may not reach a global optimum, the method discussed enumerates all motifs in the search space, and is guaranteed to produce the motifs with greatest z‐scores. We discuss the results of validation experiments in which this algorithm was used to identify candidate binding sites in several well studied regulons of Saccharomyces cerevisiae, where the most prominent transcription factor binding sites are largely known. We then discuss the results on gene families in the functional and mutant phenotype catalogs of S. cerevisiae, where the algorithm suggests many promising novel transcription factor binding sites. The program is available at http://bio.cs.washington.edu/software.html.
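The idea of enumerating every motif and ranking by z‐score can be illustrated with a minimal sketch. This is not the authors' program: the abstract does not specify the background model or how the variance is computed, so the sketch assumes an independent-base background estimated from a reference sequence set and a simple binomial variance that ignores overlap between occurrences. The function names (`count_occurrences`, `base_frequencies`, `rank_motifs_by_zscore`) and the toy sequences are hypothetical.

```python
from itertools import product
from math import sqrt


def count_occurrences(motif, sequences):
    """Count occurrences of motif (overlaps allowed) across all sequences."""
    k = len(motif)
    return sum(
        1
        for seq in sequences
        for i in range(len(seq) - k + 1)
        if seq[i:i + k] == motif
    )


def base_frequencies(sequences):
    """Estimate single-base frequencies from a set of sequences."""
    total = sum(len(s) for s in sequences)
    return {b: sum(s.count(b) for s in sequences) / total for b in "ACGT"}


def rank_motifs_by_zscore(sequences, background, k=3, top=5):
    """Enumerate all 4**k motifs and return the `top` highest z-scores.

    Assumption (not from the paper): bases are independent, so the
    expected count is positions * p and the variance is the binomial
    positions * p * (1 - p), ignoring overlap correlations.
    """
    freqs = base_frequencies(background)
    positions = sum(max(len(s) - k + 1, 0) for s in sequences)
    scored = []
    for letters in product("ACGT", repeat=k):  # exhaustive enumeration
        motif = "".join(letters)
        p = 1.0
        for b in motif:
            p *= freqs[b]
        expected = positions * p
        variance = positions * p * (1 - p)
        if variance == 0:
            continue  # motif impossible under this background; skip it
        observed = count_occurrences(motif, sequences)
        scored.append((motif, (observed - expected) / sqrt(variance)))
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top]


if __name__ == "__main__":
    promoters = ["ACGACGACGACG"]   # toy input with a repeated motif
    reference = ["ACGT" * 50]      # toy background, uniform base usage
    for motif, z in rank_motifs_by_zscore(promoters, reference):
        print(motif, round(z, 2))
```

Because every motif of length `k` is scored, the ranking cannot miss the highest-scoring motif the way a local search might; the cost is that the search space grows as 4^k, so exhaustive enumeration is only practical for short motifs.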