Measuring similarities between transcription factor binding sites
Open Access
- 28 September 2005
- journal article
- Published by Springer Nature in BMC Bioinformatics
- Vol. 6 (1), 237
- https://doi.org/10.1186/1471-2105-6-237
Abstract
Background: Collections of transcription factor binding profiles (Transfac, Jaspar) are essential for identifying regulatory elements in DNA sequences. Subsets of highly similar profiles complicate large-scale analysis of transcription factor binding sites. Results: We propose to identify and group similar profiles using two independent similarity measures: χ² distances between position frequency matrices (PFMs) and correlation coefficients between position weight matrix (PWM) scores. Conclusion: We show that these measures complement each other and make it possible to associate Jaspar and Transfac matrices. Clusters of highly similar matrices are identified and can be used to optimise the search for regulatory elements. Moreover, the application of the measures is illustrated by assigning E-box matrices from a SELEX experiment and from experimentally characterised binding sites of circadian clock genes to the Myc-Max cluster.
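The two measures named in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact formulations, so everything here is an assumption: the χ² distance is taken as a sum of per-column χ² homogeneity statistics over two already aligned PFMs of equal width, the PWM is a log-odds matrix with a pseudocount and a uniform background, and the score correlation is a Pearson coefficient over all sequences of the matrix width. The function names and the toy matrices are hypothetical, not taken from the paper.

```python
import math
from itertools import product

BASES = "ACGT"  # column order assumed for every matrix below


def chi2_column(col_a, col_b):
    """Chi-square homogeneity statistic for two count columns (A, C, G, T)."""
    n_a, n_b = sum(col_a), sum(col_b)
    total = n_a + n_b
    stat = 0.0
    for a, b in zip(col_a, col_b):
        pooled = a + b
        if pooled == 0:
            continue  # base absent from both columns, contributes nothing
        exp_a = n_a * pooled / total
        exp_b = n_b * pooled / total
        stat += (a - exp_a) ** 2 / exp_a + (b - exp_b) ** 2 / exp_b
    return stat


def chi2_distance(pfm_a, pfm_b):
    """Assumed distance: sum of column statistics for aligned, equal-width PFMs."""
    return sum(chi2_column(a, b) for a, b in zip(pfm_a, pfm_b))


def pfm_to_pwm(pfm, pseudocount=1.0, background=0.25):
    """Log-odds PWM; pseudocount and uniform background are assumptions."""
    pwm = []
    for col in pfm:
        n = sum(col) + 4 * pseudocount
        pwm.append([math.log2((c + pseudocount) / n / background) for c in col])
    return pwm


def score(pwm, seq):
    """Additive PWM score of a sequence with the same length as the matrix."""
    return sum(col[BASES.index(base)] for col, base in zip(pwm, seq))


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length score lists."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def score_correlation(pwm_a, pwm_b):
    """Correlate scores over all 4**width sequences (feasible for short motifs)."""
    seqs = ["".join(p) for p in product(BASES, repeat=len(pwm_a))]
    return pearson([score(pwm_a, s) for s in seqs],
                   [score(pwm_b, s) for s in seqs])


# Toy three-column PFMs, one row of A/C/G/T counts per position (made up).
pfm_1 = [[8, 1, 1, 0], [0, 9, 1, 0], [1, 0, 8, 1]]
pfm_2 = [[7, 2, 1, 0], [1, 8, 1, 0], [0, 1, 9, 0]]

print("chi2 distance:", chi2_distance(pfm_1, pfm_2))
print("score correlation:", score_correlation(pfm_to_pwm(pfm_1), pfm_to_pwm(pfm_2)))
```

In this sketch a small χ² distance and a correlation near 1 both point to similar profiles. The measures draw on different inputs (raw counts versus scores), which is one way to read the abstract's statement that they complement each other.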