Predicting transcription factor binding sites using local over-representation and comparative genomics
Open Access
- 31 August 2006
- journal article
- software
- Published by Springer Nature in BMC Bioinformatics
- Vol. 7 (1) , 396
- https://doi.org/10.1186/1471-2105-7-396
Abstract
Identifying cis-regulatory elements is crucial to understanding gene expression, which makes the computational detection of overrepresented transcription factor binding sites (TFBSs) in coexpressed or coregulated genes an important task. This is a challenging problem, however, especially in higher eukaryotic organisms. We have developed a method, named TFM-Explorer, that searches a set of coregulated genes for locally overrepresented TFBSs. The binding sites are modeled by profiles taken from a database of position weight matrices. The novelty of the method is that it takes advantage of spatial conservation in the sequence and supports multiple species. The efficiency of the underlying algorithm and its robustness to noise allow weak regulatory signals to be detected in large heterogeneous data sets. TFM-Explorer provides an efficient way to predict TFBS overrepresentation in related sequences. Promising results were obtained in a variety of examples in human, mouse, and rat genomes. The software is publicly available at http://bioinfo.lifl.fr/TFM-Explorer.
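As a rough illustration of what "locally overrepresented" can mean, the sketch below scores a position weight matrix along each sequence and then looks for the window in which the most sequences carry a hit. It is not the TFM-Explorer algorithm: the count matrix, the score threshold, the window size, and the function names (`log_odds_pwm`, `scan`, `densest_window`) are all hypothetical, the sequences are assumed to have equal length and to be aligned on a common reference point, and no statistical significance is computed.

```python
import math

# Hypothetical count matrix for a 4-bp motif (illustrative only,
# not taken from any real matrix database).
COUNTS = {
    "A": [8, 0, 0, 8],
    "C": [0, 8, 0, 0],
    "G": [0, 0, 8, 0],
    "T": [0, 0, 0, 0],
}


def log_odds_pwm(counts, pseudocount=1.0, background=0.25):
    """Turn base counts into log2 odds scores against a uniform background."""
    length = len(counts["A"])
    pwm = []
    for i in range(length):
        total = sum(counts[b][i] for b in "ACGT") + 4 * pseudocount
        pwm.append({
            b: math.log2(((counts[b][i] + pseudocount) / total) / background)
            for b in "ACGT"
        })
    return pwm


def scan(seq, pwm, threshold):
    """Return the start positions whose PWM score reaches the threshold."""
    m = len(pwm)
    hits = []
    for start in range(len(seq) - m + 1):
        score = sum(pwm[i][seq[start + i]] for i in range(m))
        if score >= threshold:
            hits.append(start)
    return hits


def densest_window(sequences, pwm, threshold, window):
    """Return (window_start, n_sequences) for the first window that
    contains a hit start in the largest number of sequences."""
    length = len(sequences[0])
    all_hits = [scan(s, pwm, threshold) for s in sequences]
    best = (0, 0)
    for w in range(length - window + 1):
        n = sum(any(w <= h < w + window for h in hits) for hits in all_hits)
        if n > best[1]:
            best = (w, n)
    return best


# Toy input: three sequences carry the motif ACGA near position 10.
SEQUENCES = [
    "TTTTTTTTTTACGATTTTTT",
    "TTTTTTTTTTTACGATTTTT",
    "TTACGATTTTTTACGATTTT",
    "TTTTTTTTTTTTTTTTTTTT",
]
PWM = log_odds_pwm(COUNTS)
RESULT = densest_window(SEQUENCES, PWM, threshold=5.0, window=4)
print(RESULT)
```

Counting sequences with a hit rather than total hits keeps one repeat-rich sequence from dominating a window. A real tool would also have to judge whether such a count is surprising given the background sequence composition, which this sketch leaves out.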