Identification of functional clusters of transcription factor binding motifs in genome sequences: the MSCAN algorithm
Open Access
- 3 July 2003
- journal article
- research article
- Published by Oxford University Press (OUP) in Bioinformatics
- Vol. 19 (suppl_1) , i169-i176
- https://doi.org/10.1093/bioinformatics/btg1021
Abstract
Motivation: The identification of regulatory control regions within genomes is a major challenge. Studies have demonstrated that regulating regions can be described as locally dense clusters or modules of cis-acting transcription factor binding sites (TFBS). For well-described biological contexts, it is possible to train predictive algorithms to discern novel modules in genome sequences. However, the utility of module detection methods has been severely limited by insufficient training data. For only a few tissues can one obtain sufficient numbers of literature-derived regulatory modules.
Results: We present a novel method, MSCAN, that circumvents the training data problem by measuring the statistical significance of any non-overlapping combination of TFBS in a window. Given a set of transcription factor binding profiles, a significance threshold, and a genomic sequence, MSCAN returns putative regulatory regions. We assess performance on two curated collections of regulatory regions, one each for tissue-specific expression in liver and skeletal muscle cells. The efficiency of MSCAN allows for predictive screens of entire genomes.
Availability: http://tfscan.cgb.ki.se/cgi-bin/MSCAN
Contact: wyeth@cmmt.ubc.ca
Keywords: transcription, gene networks, modules, motif, promoter.
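The input/output contract described in the Results (binding profiles, a threshold, and a sequence in; putative regulatory regions out) can be illustrated with a small sketch. This is not the MSCAN implementation: the abstract does not give the significance calculation, so the code below substitutes a plain sum of log-odds scores over the best non-overlapping set of hits in each window. The profile names (`TF_A`, `TF_B`), the consensus-derived toy matrices, the uniform background, and all function names are assumptions made for illustration only.

```python
import math

BACKGROUND = 0.25  # assumed uniform base frequency (illustrative only)

def make_profile(consensus, p_match=0.85):
    """Build a toy position probability matrix from a consensus string."""
    p_other = (1.0 - p_match) / 3.0
    return [{b: (p_match if b == c else p_other) for b in "ACGT"}
            for c in consensus]

def log_odds(profile, site):
    """Log2 odds of a site under the profile versus the background."""
    return sum(math.log2(col[b] / BACKGROUND) for col, b in zip(profile, site))

def find_hits(seq, profiles, min_score):
    """Return (start, end, name, score) hits, sorted by end position.

    Assumes seq contains only the uppercase letters A, C, G, T.
    """
    hits = []
    for name, profile in profiles.items():
        width = len(profile)
        for start in range(len(seq) - width + 1):
            score = log_odds(profile, seq[start:start + width])
            if score >= min_score:
                hits.append((start, start + width, name, score))
    return sorted(hits, key=lambda h: h[1])

def best_nonoverlapping(hits):
    """Maximum total score over non-overlapping hits (hits sorted by end)."""
    best = [0.0] * (len(hits) + 1)
    for i, (start, _end, _name, score) in enumerate(hits):
        j = i
        # Step back to the last earlier hit that ends at or before this start.
        while j > 0 and hits[j - 1][1] > start:
            j -= 1
        best[i + 1] = max(best[i], best[j] + score)
    return best[-1]

def scan(seq, profiles, window, min_score, cluster_threshold):
    """Report merged (start, end) windows whose hit cluster passes the threshold.

    The summed score is a stand-in for a significance measure, not MSCAN's.
    """
    hits = find_hits(seq, profiles, min_score)
    regions = []
    for s in range(max(1, len(seq) - window + 1)):
        e = s + window
        inside = [h for h in hits if h[0] >= s and h[1] <= e]
        if best_nonoverlapping(inside) >= cluster_threshold:
            if regions and s <= regions[-1][1]:
                regions[-1] = (regions[-1][0], e)  # extend the previous region
            else:
                regions.append((s, e))
    return regions

# Hypothetical profiles and a synthetic sequence with two adjacent sites.
profiles = {"TF_A": make_profile("ACGT"), "TF_B": make_profile("TTGA")}
clustered = "C" * 20 + "ACGT" + "CC" + "TTGA" + "C" * 20
isolated = "C" * 20 + "ACGT" + "C" * 20
print(scan(clustered, profiles, window=12, min_score=6.0, cluster_threshold=12.0))
print(scan(isolated, profiles, window=12, min_score=6.0, cluster_threshold=12.0))
```

In this toy setting a single site never reaches the cluster threshold, so only the sequence with two nearby sites yields a region. A real screen would replace the summed score with a proper significance estimate and use experimentally derived binding profiles.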