Computation-Based Discovery of Related Transcriptional Regulatory Modules and Motifs Using an Experimentally Validated Combinatorial Model

Abstract
We report here the identification of a previously unknown transcription regulatory element for heat shock (HS) genes in Caenorhabditis elegans. We monitored the expression pattern of 11,917 genes from C. elegans to determine the genes that were up-regulated on HS. Twenty-eight genes were observed to be consistently up-regulated in several different repetitions of the experiments. We analyzed the upstream regions of these genes using computational DNA pattern recognition methods. Two potential cis-regulatory motifs were identified in this way. One of these motifs (TTCTAGAA) was the DNA binding motif for the heat shock factor (HSF), whereas the other (GGGTGTC) was previously unreported in the literature. We determined the significance of these motifs for the HS genes using different statistical tests and parameters. Comparative sequence analysis of orthologous HS genes from C. elegans and Caenorhabditis briggsae indicated that the identified DNA regulatory motifs are conserved across related species. The role of the identified DNA sites in regulation of HS genes was tested by in vitro mutagenesis of a green fluorescent protein (GFP) reporter transgene driven by the C. elegans hsp-16-2 promoter. DNA sites corresponding to both motifs are shown to play a significant role in up-regulation of the hsp-16-2 gene on HS. This is one of the rare instances in which a novel regulatory element, identified using computational methods, is shown to be biologically active. The contributions of individual sites toward induction of transcription on HS are nonadditive, which indicates interaction and cross-talk between the sites, possibly through the transcription factors (TFs) binding to these sites. [The following individuals kindly provided reagents, samples, or unpublished information as indicated in the paper: L. Hillier.]
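As an illustration of what locating the two reported motifs in an upstream region involves, the following minimal sketch scans a sequence for exact matches on both strands. It is not the pattern recognition method used in the study, which the abstract does not specify; the function names and the toy sequence are hypothetical and chosen only for this example.

```python
# Minimal sketch: exact-match scan for the two motifs named in the abstract.
# Assumption: exact string matching on both strands; the study's actual
# pattern recognition and statistical methods are not reproduced here.

HSF_MOTIF = "TTCTAGAA"   # HSF binding motif reported in the abstract
NOVEL_MOTIF = "GGGTGTC"  # previously unreported motif from the abstract

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def find_sites(upstream, motif):
    """Return (position, strand) pairs for exact motif matches.

    Positions are 0-based offsets into `upstream`. A match to the motif
    itself is reported as "+", a match to its reverse complement as "-".
    A palindromic motif is reported once, on the "+" strand.
    """
    upstream = upstream.upper()
    motif = motif.upper()
    rc = reverse_complement(motif)
    width = len(motif)
    sites = []
    for i in range(len(upstream) - width + 1):
        window = upstream[i:i + width]
        if window == motif:
            sites.append((i, "+"))
        elif window == rc:
            sites.append((i, "-"))
    return sites


# Toy sequence invented for illustration; it is not a real promoter.
toy_upstream = "ACGTTCTAGAACCGGGTGTCAAGACACCCTT"

hsf_sites = find_sites(toy_upstream, HSF_MOTIF)
novel_sites = find_sites(toy_upstream, NOVEL_MOTIF)
print(hsf_sites)
print(novel_sites)
```

Note that TTCTAGAA is its own reverse complement, so scanning both strands adds no extra hits for it, whereas GGGTGTC can also appear as GACACCC on the opposite strand.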