Analysis of combinatorial cis-regulation in synthetic and genomic promoters

Open Access
Abstract
Despite countless catalogues of genomic binding sites for transcription factors, predicting the expression of a gene purely from its DNA sequence has remained inaccurate. Gertz et al. present a general thermodynamic model that accurately captures the relationship between a gene promoter sequence, including weak, statistically undetectable regulatory sites, and its expression output. The work implies that chromatin plays a relatively minor role in directing gene expression and will facilitate rational genetic design in biotechnology and synthetic biology.
Transcription factor binding sites are being discovered at a rapid pace [1,2]. It is now necessary to turn attention towards understanding how these sites work in combination to influence gene expression. Quantitative models that accurately predict gene expression from promoter sequence [3,4,5] will be a crucial part of solving this problem. Here we present such a model, based on the analysis of synthetic promoter libraries in yeast (Saccharomyces cerevisiae). Thermodynamic models based only on the equilibrium binding of transcription factors to DNA and to each other captured a large fraction of the variation in expression in every library. Thermodynamic analysis of these libraries uncovered several phenomena in our system, including cooperativity and the effects of weak binding sites. When applied to the S. cerevisiae genome, a model of repression by Mig1 (which was trained on synthetic promoters) predicts a number of Mig1-regulated genes that lack significant Mig1-binding sites in their promoters.
The success of the thermodynamic approach suggests that the information encoded by combinations of cis-regulatory sites is interpreted primarily through simple protein–DNA and protein–protein interactions, with complicated biochemical reactions—such as nucleosome modifications—being downstream events. Quantitative analyses of synthetic promoter libraries will be an important tool in unravelling the rules underlying combinatorial cis-regulation.
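To make the idea of an equilibrium-binding model concrete, the following is a minimal sketch of a generic statistical-thermodynamic promoter calculation. It is not the model fitted in the paper: the function name `polymerase_occupancy`, the parameterisation, and all numerical values are illustrative assumptions. Every promoter state (each combination of occupied sites, with or without polymerase) receives a Boltzmann weight, and expression is taken to be proportional to the probability that polymerase is bound.

```python
import itertools


def polymerase_occupancy(sites, pol_weight, coop=None):
    """Probability that polymerase is bound, under equilibrium binding.

    All parameters are hypothetical and dimensionless:
      sites      -- list of (site_weight, pol_interaction) tuples, where
                    site_weight is [TF] * K for that site and
                    pol_interaction multiplies the polymerase weight when
                    the site is occupied (<1 repressor, >1 activator).
      pol_weight -- Boltzmann weight of polymerase binding on a bare promoter.
      coop       -- optional dict {(i, j): omega} giving a protein-protein
                    interaction factor when sites i and j are both occupied.
    """
    coop = coop or {}
    z_total = 0.0  # partition function over all states
    z_bound = 0.0  # sum over states with polymerase bound

    # Enumerate every occupancy pattern of the transcription factor sites.
    for occupancy in itertools.product((0, 1), repeat=len(sites)):
        tf_weight = 1.0
        pol_term = pol_weight
        for (site_weight, pol_interaction), bound in zip(sites, occupancy):
            if bound:
                tf_weight *= site_weight
                pol_term *= pol_interaction
        for (i, j), omega in coop.items():
            if occupancy[i] and occupancy[j]:
                tf_weight *= omega

        # Each TF configuration occurs with polymerase absent or present.
        z_total += tf_weight * (1.0 + pol_term)
        z_bound += tf_weight * pol_term

    return z_bound / z_total


# Illustrative values only: a weak repressor site that excludes polymerase.
weak_site = (0.1, 0.0)
no_sites = polymerase_occupancy([], pol_weight=1.0)
one_weak = polymerase_occupancy([weak_site], pol_weight=1.0)
two_weak = polymerase_occupancy([weak_site, weak_site], pol_weight=1.0)
two_weak_coop = polymerase_occupancy(
    [weak_site, weak_site], pol_weight=1.0, coop={(0, 1): 10.0}
)
print(no_sites, one_weak, two_weak, two_weak_coop)
```

In this toy setting each additional weak repressor site lowers the predicted occupancy a little, and a cooperative interaction between the two sites lowers it further, which is the qualitative behaviour the abstract attributes to weak sites and cooperativity. Exhaustive enumeration grows as 2^n in the number of sites, so it is suitable only for short promoters.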