A set of viral DNA decamers enriched in transcription control signals
- 1 January 1991
- journal article
- Published by Oxford University Press (OUP) in Nucleic Acids Research
- Vol. 19 (13) , 3733-3740
- https://doi.org/10.1093/nar/19.13.3733
Abstract
We studied the frequency distribution of oligonucleotides 10 bp long in a sample of 620 Kb of viral genomes, containing 102 sequences from GenBank, with the aim of detecting transcription control signals. Two thousand three hundred decamers had a frequency 10 times higher than the mean and were subjected to further statistical analysis. For each of the 2300 decamers (parents), we counted the individual frequencies of the 30 decamers differing from the parent by one base mutation (progeny) and then calculated two variance/mean chi squares for the progeny, with and without the parent. We then studied the distribution of the ratio between the two chi squares. Out of 2300 decamers, 10 times more frequent than average, 479 decamers had a chi square ratio of 1.9 or larger. In this final set, which corresponds to less than 0.05% of all possible decamers, 58 decamers were found to contain viral and eukaryotic transcription control elements, like NF-kB, Sp1 and others. Furthermore, this set contains an excess of signals of length 5, 6, 7, 8, 9 and 10, when compared to 150 random sets, bootstrapped from the same viral genomes.
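The parent/progeny screen described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function names (`count_kmers`, `enriched`, `progeny`, `dispersion_chi2`, `chi2_ratio`) are hypothetical, and several details are assumptions that the abstract does not specify: the variance/mean chi square is taken to be the sum of squared deviations from the mean divided by the mean, the ratio is taken as the value with the parent over the value without it, and the mean frequency is taken over all 4^10 possible decamers.

```python
from collections import Counter

K = 10          # decamer length, as in the abstract
BASES = "ACGT"


def count_kmers(sequences, k=K):
    """Count every overlapping k-mer made only of A, C, G, T."""
    counts = Counter()
    for seq in sequences:
        seq = seq.upper()
        for i in range(len(seq) - k + 1):
            kmer = seq[i:i + k]
            if set(kmer) <= set(BASES):
                counts[kmer] += 1
    return counts


def enriched(counts, k=K, fold=10):
    """k-mers at least `fold` times more frequent than the mean.

    Assumption: the mean is taken over all 4**k possible k-mers.
    """
    mean = sum(counts.values()) / 4 ** k
    return [kmer for kmer, n in counts.items() if n >= fold * mean]


def progeny(parent):
    """All oligomers differing from `parent` by exactly one base."""
    out = []
    for i, base in enumerate(parent):
        for b in BASES:
            if b != base:
                out.append(parent[:i] + b + parent[i + 1:])
    return out


def dispersion_chi2(values):
    """Variance/mean chi square, assumed here to be sum((x - mean)^2) / mean."""
    mean = sum(values) / len(values)
    if mean == 0:
        return 0.0
    return sum((x - mean) ** 2 for x in values) / mean


def chi2_ratio(parent, counts):
    """Chi square of progeny plus parent, divided by chi square of progeny alone.

    Assumption: the direction of the ratio (with / without) is inferred
    from the 1.9 cutoff; the abstract does not state it.
    """
    prog = [counts.get(p, 0) for p in progeny(parent)]
    without_parent = dispersion_chi2(prog)
    with_parent = dispersion_chi2(prog + [counts.get(parent, 0)])
    if without_parent == 0:
        return float("inf") if with_parent > 0 else 0.0
    return with_parent / without_parent
```

Under these assumptions, a decamer would be retained when `chi2_ratio` is 1.9 or larger, meaning the parent stands out sharply from its one-mutation neighbours. As a check on the abstract's arithmetic, there are 4^10 = 1,048,576 possible decamers, so 479 of them is about 0.046%, consistent with "less than 0.05%".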