Genome wide analysis of Arabidopsis core promoters
Open Access
- 25 February 2005
- journal article
- Published by Springer Nature in BMC Genomics
- Vol. 6 (1), 25
- https://doi.org/10.1186/1471-2164-6-25
Abstract
Background: Core promoters are the gene regulatory regions most proximal to the transcription start site (TSS), central to the formation of pre-initiation complexes and for combinatorial gene regulation. The DNA elements required for core promoter function in plants are poorly understood. To establish the sequence motifs that characterize plant core promoters and to compare them to the corresponding sequences in animals, we took advantage of available full-length cDNAs (FL-cDNAs) and predicted upstream regulatory sequences to carry out the analysis of 12,749 Arabidopsis core promoters.

Results: Using a combination of expectation maximization and Gibbs sampling methods, we identified several motifs overrepresented in Arabidopsis core promoters. One of them corresponded to the TATA element, for which an in-depth analysis resulted in the generation of robust TATA Nucleotide Frequency Matrices (NFMs) capable of predicting Arabidopsis TATA elements with a high degree of confidence. We established that approximately 29% of all Arabidopsis promoters contain TATA motifs, clustered around position -32 with respect to the TSS. The presence of TATA elements was associated with genes represented more frequently in EST collections and with shorter 5' UTRs. No cis-elements were found over-represented in TATA-less, compared to TATA-containing promoters.

Conclusion: Our studies provide a first genome-wide illustration of the composition and structure of core Arabidopsis promoters. The percentage of TATA-containing promoters is much lower than commonly recognized, yet comparable to the number of Drosophila promoters containing a TATA element. Although several other DNA elements were identified as over-represented in Arabidopsis promoters, they are present in only a small fraction of the genes and they represent elements not previously described in animals, suggesting a distinct architecture of the core promoters of plant and animal genes.
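
The Results mention Nucleotide Frequency Matrices (NFMs) used to predict TATA elements. As a rough illustration of how such a matrix can be built and used to scan a promoter, the sketch below is one common approach and not the authors' procedure or their matrices. The aligned sites, the pseudocount, the uniform 0.25 background, and the function names `build_nfm`, `log_odds_score`, and `best_hit` are all assumptions made for the example.

```python
import math

BASES = "ACGT"


def build_nfm(sites, pseudocount=1.0):
    """Return per-position nucleotide frequencies from equal-length aligned sites."""
    length = len(sites[0])
    if any(len(site) != length for site in sites):
        raise ValueError("all aligned sites must have the same length")
    nfm = []
    for i in range(length):
        # Start from a pseudocount so unseen bases never get frequency zero.
        counts = {base: pseudocount for base in BASES}
        for site in sites:
            counts[site[i]] += 1
        total = sum(counts.values())
        nfm.append({base: counts[base] / total for base in BASES})
    return nfm


def log_odds_score(window, nfm, background=0.25):
    """Sum of log2(frequency / background) over the window (uniform background assumed)."""
    return sum(math.log2(column[base] / background)
               for column, base in zip(nfm, window))


def best_hit(sequence, nfm):
    """Return (score, offset) of the highest-scoring window in the sequence."""
    width = len(nfm)
    return max((log_odds_score(sequence[i:i + width], nfm), i)
               for i in range(len(sequence) - width + 1))


# Hypothetical aligned TATA-like sites, invented for illustration only.
toy_sites = ["TATAAATA", "TATATATA", "TATAAAAA", "TATATAAA"]
toy_nfm = build_nfm(toy_sites)

# Hypothetical upstream fragment with a TATA-like stretch starting at offset 6.
toy_promoter = "GCGCGCTATAAATAGCGC"
score, offset = best_hit(toy_promoter, toy_nfm)
```

In a real analysis the offset would be converted to a position relative to the TSS, and a score threshold would decide whether a promoter counts as TATA-containing; neither step is specified in the abstract, so both are left out here.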