A Gibbs Sampling Method to Detect Overrepresented Motifs in the Upstream Regions of Coexpressed Genes
- 1 April 2002
- journal article
- research article
- Published by Mary Ann Liebert Inc in Journal of Computational Biology
- Vol. 9 (2) , 447-464
- https://doi.org/10.1089/10665270252935566
Abstract
Microarray experiments can reveal important information about transcriptional regulation. In our case, we look for potential promoter regulatory elements in the upstream region of coexpressed genes. Here we present two modifications of the original Gibbs sampling algorithm for motif finding (Lawrence et al., 1993). First, we introduce the use of a probability distribution to estimate the number of copies of the motif in a sequence. Second, we describe the technical aspects of the incorporation of a higher-order background model whose application we discussed in Thijs et al. (2001). Our implementation is referred to as the Motif Sampler. We successfully validate our algorithm on several data sets. First, we show results for three sets of upstream sequences containing known motifs: 1) the G-box light-response element in plants, 2) elements involved in methionine response in Saccharomyces cerevisiae, and 3) the FNR O2-responsive element in bacteria. We use these data sets to explain the influence of the parameters on the performance of our algorithm. Second, we show results for upstream sequences from four clusters of coexpressed genes identified in a microarray experiment on wounding in Arabidopsis thaliana. Several motifs could be matched to regulatory elements from plant defence pathways in our database of plant cis-acting regulatory elements (PlantCARE). Some other strong motifs do not have corresponding motifs in PlantCARE but are promising candidates for further analysis.Keywords
This publication has 34 references indexed in Scilit:
- Computational identification of Cis -regulatory elements associated with groups of functionally related genes in Saccharomyces cerevisiae 1 1Edited by F. E. CohenJournal of Molecular Biology, 2000
- Inferring regulatory elements from a whole genome. an analysis of Helicobacter pyloriσ80 family of promoter signalsJournal of Molecular Biology, 2000
- Improved microbial gene identification with GLIMMERNucleic Acids Research, 1999
- Evaluation of gene prediction software using a genomic data set: application to Arabidopsis thalianasequencesBioinformatics, 1999
- Regulatory elements and expression profilesCurrent Opinion in Structural Biology, 1999
- Interpolated markov chains for eukaryotic promoter recognition.Bioinformatics, 1999
- Finding DNA regulatory motifs within unaligned noncoding sequences clustered by whole-genome mRNA quantitationNature Biotechnology, 1998
- Extracting regulatory sites from the upstream region of yeast genes by computational analysis of oligonucleotide frequencies 1 1Edited by G. von HeijneJournal of Molecular Biology, 1998
- Wound Signaling in Tomato Plants1Plant Physiology, 1998
- Bayesian Models for Multiple Local Sequence Alignment and Gibbs Sampling StrategiesJournal of the American Statistical Association, 1995