Using hexamers to predict cis-regulatory motifs in Drosophila
Open Access
- 27 October 2005
- journal article
- Published by Springer Nature in BMC Bioinformatics
- Vol. 6 (1) , 262
- https://doi.org/10.1186/1471-2105-6-262
Abstract
Cis-regulatory modules (CRMs) are short stretches of DNA that help regulate gene expression in higher eukaryotes. They have been found up to 1 megabase away from the genes they regulate and can be located upstream, downstream, and even within their target genes. Due to the difficulty of finding CRMs using biological and computational techniques, even well-studied regulatory systems may contain CRMs that have not yet been discovered. We present a simple, efficient method (HexDiff) based only on hexamer frequencies of known CRMs and non-CRM sequence to predict novel CRMs in regulatory systems. On a data set of 16 gap and pair-rule genes containing 52 known CRMs, predictions made by HexDiff had a higher correlation with the known CRMs than several existing CRM prediction algorithms: Ahab, Cluster Buster, MSCAN, MCAST, and LWF. After combining the results of the different algorithms, 10 putative CRMs were identified and are strong candidates for future study. The hexamers used by HexDiff to distinguish between CRMs and non-CRM sequence were also analyzed and were shown to be enriched in regulatory elements. HexDiff provides an efficient and effective means for finding new CRMs based on known CRMs, rather than known binding sites.
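The abstract gives only the general idea behind HexDiff: compare hexamer frequencies in known CRMs with those in non-CRM sequence, then use the difference to flag candidate regions. The Python sketch below illustrates that general idea and is not the published algorithm. The function names, the log-ratio weighting, the pseudocount, the neutral score for unseen hexamers, and the sliding-window size are all assumptions made for illustration.

```python
from collections import Counter
from math import log

K = 6  # hexamer length


def kmer_counts(seqs, k=K):
    """Count overlapping k-mers across sequences, skipping any with non-ACGT bases."""
    counts = Counter()
    for seq in seqs:
        seq = seq.upper()
        for i in range(len(seq) - k + 1):
            kmer = seq[i:i + k]
            if set(kmer) <= set("ACGT"):
                counts[kmer] += 1
    return counts


def log_ratio_weights(crm_seqs, background_seqs, k=K, pseudocount=1.0):
    """Weight each observed k-mer by log(freq in CRMs / freq in background).

    Assumption: a pseudocount smooths the frequencies so that a k-mer seen
    in only one of the two sets still gets a finite weight.
    """
    crm = kmer_counts(crm_seqs, k)
    bg = kmer_counts(background_seqs, k)
    n_kmers = 4 ** k
    crm_total = sum(crm.values()) + pseudocount * n_kmers
    bg_total = sum(bg.values()) + pseudocount * n_kmers
    weights = {}
    for kmer in set(crm) | set(bg):
        f_crm = (crm[kmer] + pseudocount) / crm_total
        f_bg = (bg[kmer] + pseudocount) / bg_total
        weights[kmer] = log(f_crm / f_bg)
    return weights


def window_scores(seq, weights, window=20, k=K):
    """Sum k-mer weights over each run of `window` consecutive k-mer positions.

    Simplification: k-mers absent from `weights` are treated as neutral (0.0).
    """
    seq = seq.upper()
    per_pos = [weights.get(seq[i:i + k], 0.0) for i in range(len(seq) - k + 1)]
    return [
        sum(per_pos[start:start + window])
        for start in range(len(per_pos) - window + 1)
    ]
```

With toy inputs such as `log_ratio_weights(["ACGTACGTACGT"], ["TTTTTTTTTTTT"])`, hexamers that are more frequent in the CRM set receive positive weights and those more frequent in the background receive negative ones, so windows with high scores resemble the training CRMs. Choosing a score threshold and a realistic window size would require the real training data, which is not given here.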