Predicting functional transcription factor binding through alignment-free and affinity-based analysis of orthologous promoter sequences
Open Access
- 1 July 2008
- journal article
- Published by Oxford University Press (OUP) in Bioinformatics
- Vol. 24 (13), i165-i171
- https://doi.org/10.1093/bioinformatics/btn154
Abstract
Motivation: The identification of transcription factor (TF) binding sites and the regulatory circuitry that they define is currently an area of intense research. Data from whole-genome chromatin immunoprecipitation (ChIP–chip), whole-genome expression microarrays, and sequencing of multiple closely related genomes have all proven useful. By and large, existing methods treat the interpretation of functional data as a classification problem (between bound and unbound DNA), and the analysis of comparative data as a problem of local alignment (to recover phylogenetic footprints of presumably functional elements). Both of these approaches suffer from the inability to model and detect low-affinity binding sites, which have recently been shown to be abundant and functional.
Results: We have developed a method that discovers functional regulatory targets of TFs by predicting the total affinity of each promoter for those factors and then comparing that affinity across orthologous promoters in closely related species. At each promoter, we consider the minimum affinity among orthologs to be the fraction of the affinity that is functional. Because we calculate the affinity of the entire promoter, our method is independent of local alignment. By comparing with functional annotation information and gene expression data in Saccharomyces cerevisiae, we have validated that this biophysically motivated use of evolutionary conservation gives rise to dramatic improvement in prediction of regulatory connectivity and factor–factor interactions compared to the use of a single genome. We propose novel biological functions for several yeast TFs, including the factors Snt2 and Stb4, for which no function has been reported. Our affinity-based approach towards comparative genomics may allow a more quantitative analysis of the principles governing the evolution of non-coding DNA.
Availability: The MatrixREDUCE software package is available from http://www.bussemakerlab.org/software/MatrixREDUCE
Contact: Harmen.Bussemaker@columbia.edu
Supplementary information: Supplementary data are available at Bioinformatics online.
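The two steps named in the Results (a total affinity for each whole promoter, then the minimum of that affinity across orthologs) can be illustrated with a minimal sketch. It is not the MatrixREDUCE implementation. The scoring model is an assumption: each TF is represented by a per-position table of relative affinities, a window's affinity is the product of its per-position values, and the promoter total is the sum over all windows on both strands. The function names, the toy matrix, and the species labels are hypothetical.

```python
from math import prod

# Assumption: sequences use the alphabet A/C/G/T; other symbols (e.g. N) score 0.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def total_affinity(seq, affinity_matrix):
    """Sum the relative affinity of every window in seq, on both strands.

    affinity_matrix is a list of dicts, one per motif position, mapping a
    base to its relative affinity (assumed model, see lead-in).
    """
    width = len(affinity_matrix)
    total = 0.0
    for strand in (seq, reverse_complement(seq)):
        for start in range(len(strand) - width + 1):
            window = strand[start:start + width]
            total += prod(
                affinity_matrix[i].get(base, 0.0)
                for i, base in enumerate(window)
            )
    return total


def conserved_affinity(orthologous_promoters, affinity_matrix):
    """Minimum total affinity across orthologous promoters.

    No alignment is needed: each promoter is scored as a whole.
    """
    return min(
        total_affinity(seq, affinity_matrix)
        for seq in orthologous_promoters.values()
    )


# Toy two-position matrix (hypothetical values): "AC" is optimal, "AG" is weaker.
toy_matrix = [
    {"A": 1.0, "C": 0.0, "G": 0.0, "T": 0.0},
    {"A": 0.0, "C": 1.0, "G": 0.5, "T": 0.0},
]

# Hypothetical orthologous promoters for one gene.
promoters = {"species_A": "ACAG", "species_B": "AGAA"}

per_species = {name: total_affinity(seq, toy_matrix) for name, seq in promoters.items()}
functional = conserved_affinity(promoters, toy_matrix)
```

In this toy case the promoter in `species_A` has a strong and a weak site while the one in `species_B` retains only a weak site, so the minimum keeps just the portion of affinity present in both, regardless of where the sites sit in each sequence.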