Combining frequency and positional information to predict transcription factor binding sites

Abstract
Motivation: Although a number of genome projects have been completed at the sequence level, only a small proportion of DNA regulatory elements have been identified. Growing amounts of gene expression data make it possible to find coregulated genes by clustering methods. Analysis of the promoter regions of those genes may reveal the rather weak signals of transcription factor binding sites.

Results: We introduce ITB (Integrated Tool for Box finding), a new algorithm that combines frequency and positional information to predict transcription factor binding sites in upstream regions of coregulated genes. Motifs are extracted by exhaustive analysis of regular expression-like patterns and by estimating the probabilities of positional clusters of motifs. ITB detects consensus sequences of experimentally verified transcription factor binding sites of the yeast Saccharomyces cerevisiae. Moreover, it predicts a number of new binding site candidates with significant scores. In addition to yeast upstream regions, ITB is applied to human promoter sequences.

Availability: ITB is available upon request.

Contact: s.kielbasa@itb.biologie.hu-berlin.de
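The two kinds of evidence named in the Results, how often a motif occurs and how tightly its occurrences cluster by position, can be illustrated with a minimal sketch. This is not the ITB implementation, which the abstract does not specify. All function names, the fixed-length exact-match motifs (instead of regular expression-like patterns), the binomial tail probabilities, the background probability `p_background`, and the assumption of equal-length upstream regions are illustrative choices made here. The positional estimate is deliberately crude: it ignores the multiple testing that comes from sliding the window.

```python
from collections import defaultdict
from math import comb


def kmer_positions(seqs, k):
    """Map each k-mer to a list of (sequence index, start offset) pairs."""
    hits = defaultdict(list)
    for i, s in enumerate(seqs):
        for p in range(len(s) - k + 1):
            hits[s[p:p + k]].append((i, p))
    return hits


def binom_tail(n, m, p):
    """P(X >= m) for X ~ Binomial(n, p)."""
    return sum(comb(n, j) * p**j * (1 - p)**(n - j) for j in range(m, n + 1))


def best_window(offsets, width):
    """Largest number of start offsets that fit into one window of `width` positions."""
    pos = sorted(offsets)
    best, lo = 0, 0
    for hi in range(len(pos)):
        # Shrink the window from the left until it spans fewer than `width` positions.
        while pos[hi] - pos[lo] >= width:
            lo += 1
        best = max(best, hi - lo + 1)
    return best


def motif_evidence(seqs, motif, p_background, width):
    """Return (frequency p-value, positional p-value) for one exact motif.

    Assumptions (not from the source): sequences have equal length and are
    aligned at the same reference point; `p_background` is the chance that a
    random upstream region contains the motif at least once.
    """
    k = len(motif)
    hits = kmer_positions(seqs, k).get(motif, [])
    # Frequency evidence: number of sequences containing the motif.
    n_with = len({i for i, _ in hits})
    freq_p = binom_tail(len(seqs), n_with, p_background)
    # Positional evidence: occurrences concentrated in one narrow window.
    offsets = [p for _, p in hits]
    clustered = best_window(offsets, width)
    n_positions = len(seqs[0]) - k + 1
    pos_p = binom_tail(len(offsets), clustered, min(1.0, width / n_positions))
    return freq_p, pos_p


if __name__ == "__main__":
    upstream = ["TTACGCGTAA", "GGACGCGTCC", "AAACGCGTTT", "CCCCCCCCCC"]
    print(motif_evidence(upstream, "ACGCGT", p_background=0.1, width=2))
```

A small p-value from either function alone can be misleading: a motif may be frequent simply because of base composition, or clustered by chance when it occurs only a few times. Reporting both values side by side, as the sketch does, leaves open how they should be merged into a single score, since the abstract does not say how ITB combines them.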
