A Nucleosome-Guided Map of Transcription Factor Binding Sites in Yeast
Open Access
- 9 November 2007
- journal article
- research article
- Published by Public Library of Science (PLoS) in PLoS Computational Biology
- Vol. 3 (11) , e215
- https://doi.org/10.1371/journal.pcbi.0030215
Abstract
Finding functional DNA binding sites of transcription factors (TFs) throughout the genome is a crucial step in understanding transcriptional regulation. Unfortunately, these binding sites are typically short and degenerate, posing a significant statistical challenge: many more matches to known TF motifs occur in the genome than are actually functional. However, information about chromatin structure may help to identify the functional sites. In particular, it has been shown that active regulatory regions are usually depleted of nucleosomes, thereby enabling TFs to bind DNA in those regions. Here, we describe a novel motif discovery algorithm that employs an informative prior over DNA sequence positions based on a discriminative view of nucleosome occupancy. When a Gibbs sampling algorithm is applied to yeast sequence-sets identified by ChIP-chip, the correct motif is found in 52% more cases with our informative prior than with the commonly used uniform prior. This is the first demonstration that nucleosome occupancy information can be used to improve motif discovery. The improvement is dramatic, even though we are using only a statistical model to predict nucleosome occupancy; we expect our results to improve further as high-resolution genome-wide experimental nucleosome occupancy data becomes increasingly available.
Identifying transcription factor (TF) binding sites across the genome is an important problem in molecular biology. Large-scale discovery of TF binding sites is usually carried out by searching for short DNA patterns that appear often within promoter regions of genes that are known to be co-bound by a TF. In such problems, promoters have traditionally been treated as strings of nucleotide bases in which TF binding sites are assumed to be equally likely to occur at any position.
In vivo, however, TFs localize to DNA binding sites as part of a complicated thermodynamic process of cooperativity and competition, both with one another and, importantly, with DNA packaging proteins called nucleosomes. In particular, TFs are more likely to bind DNA at sites that are not occupied by nucleosomes. In this paper, we show that it is possible to incorporate knowledge of the nucleosome landscape across the genome to aid binding site discovery; indeed, our algorithm incorporating nucleosome occupancy information is significantly more accurate than conventional methods. We use our algorithm to generate a condition-dependent, nucleosome-guided map of binding sites for 55 TFs in yeast.
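The idea of replacing the uniform positional prior with an informative one can be illustrated with a minimal sketch of a Gibbs site sampler. This is not the authors' implementation: it assumes exactly one site per sequence, a uniform background model, and a deliberately simple hypothetical prior (`prior_from_occupancy`) that weights each start position by one minus the mean predicted nucleosome occupancy over the site, whereas the paper uses a discriminative prior. All function and parameter names are illustrative.

```python
import math
import random

BASES = "ACGT"


def prior_from_occupancy(occupancy, width):
    """Hypothetical prior: a start position is favoured when the
    predicted nucleosome occupancy over the site is low."""
    starts = len(occupancy) - width + 1
    raw = [max(1e-6, 1.0 - sum(occupancy[j:j + width]) / width)
           for j in range(starts)]
    z = sum(raw)
    return [r / z for r in raw]


def gibbs_with_prior(seqs, priors, width, iters=200, seed=0, pseudo=0.5):
    """Gibbs site sampler with one motif site per sequence.

    priors[i][j] is the prior weight of start j in seqs[i]; passing
    equal weights recovers the conventional uniform-prior sampler.
    """
    rng = random.Random(seed)
    n_starts = [len(s) - width + 1 for s in seqs]
    # Initialise each site from its positional prior alone.
    sites = [rng.choices(range(n), weights=p)[0]
             for n, p in zip(n_starts, priors)]
    bg = 0.25  # uniform background frequency (an assumption)
    total = len(seqs) - 1 + 4 * pseudo
    for _ in range(iters):
        for i, seq in enumerate(seqs):
            # Build the motif profile from every sequence except i.
            counts = [{b: pseudo for b in BASES} for _ in range(width)]
            for k, other in enumerate(seqs):
                if k != i:
                    for c in range(width):
                        counts[c][other[sites[k] + c]] += 1
            # Sampling weight = prior * likelihood ratio (motif / background).
            weights = []
            for j in range(n_starts[i]):
                log_lr = sum(math.log(counts[c][seq[j + c]] / total / bg)
                             for c in range(width))
                weights.append(priors[i][j] * math.exp(log_lr))
            sites[i] = rng.choices(range(n_starts[i]), weights=weights)[0]
    return sites
```

In this sketch the prior enters only as a multiplicative weight on each candidate start, so positions predicted to be nucleosome-covered are sampled less often even when they match the current motif profile well.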