The effect of codon usage on the oligonucleotide composition of the E. coli genome and identification of over- and underrepresented sequences by Markov chain analysis
Open Access
- 25 March 1987
- journal article
- research article
- Published by Oxford University Press (OUP) in Nucleic Acids Research
- Vol. 15 (6) , 2627-2638
- https://doi.org/10.1093/nar/15.6.2627
Abstract
As shown in the accompanying paper (5), the oligonucleotide composition of the E. coli genome is highly asymmetric for sequences up to 6 bp in length when ranked from highest to lowest abundance. We show here that this largely reflects codon usage because heavily used codons were found in the highly abundant oligomers whereas rarely used codons, with some exceptions, occurred in sequences in low abundance. Furthermore, linear regression analysis revealed a strong correlation between the frequencies of each trinucleotide and its usage as a codon. Dinucleotides are also not randomly distributed across each codon position and the dinucleotide composition of genes that are transcribed but not translated (rRNA and tRNA genes) was highly related to that seen in genes encoding polypeptides. However, 45 tetra-, 8 penta-, and 6 hexanucleotides were significantly over- or underabundant by Markov chain analysis and could not be accounted for by codon usage. Of these underrepresented sequences, many were palindromes, including the Dam methylation site.
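The abstract does not state the order of the Markov model or the significance test the authors used. As an illustration only, the sketch below assumes a common maximal-order formulation, in which the expected count of an oligomer is estimated from the counts of its two overlapping (k-1)-mers and their shared (k-2)-mer core: E(w) = N(prefix) * N(suffix) / N(core). The function names `count_kmers`, `markov_expected`, and `obs_exp_ratio` are hypothetical and not taken from the paper.

```python
from collections import Counter


def count_kmers(seq, k):
    """Count overlapping k-mers on a single strand of seq."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def markov_expected(word, seq):
    """Expected count of word under an assumed maximal-order Markov model.

    E(w) = N(w[:-1]) * N(w[1:]) / N(w[1:-1]); this estimator is an
    assumption, the abstract does not specify the model order.
    """
    k = len(word)
    if k < 3:
        raise ValueError("word must be at least 3 nucleotides long")
    shorter = count_kmers(seq, k - 1)
    core = count_kmers(seq, k - 2)[word[1:-1]]
    if core == 0:
        return 0.0
    return shorter[word[:-1]] * shorter[word[1:]] / core


def obs_exp_ratio(word, seq):
    """Observed/expected ratio; values well below 1 suggest underrepresentation."""
    observed = count_kmers(seq, len(word))[word]
    expected = markov_expected(word, seq)
    return observed / expected if expected else float("nan")
```

Applied to a long sequence, `obs_exp_ratio("GATC", genome)` would give the ratio for the Dam methylation site (GATC). Deciding whether a ratio is significantly different from 1 needs a separate statistical test, which this sketch leaves out.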
This publication has 16 references indexed in Scilit:
- A comprehensive package for DNA sequence analysis in FORTRAN IV for the PDP-11. Nucleic Acids Research, 1986
- Molecular evolution of bacteriophages: evidence of selection against the recognition sites of host restriction enzymes. Molecular Biology and Evolution, 1986
- Transcriptional block caused by a negative supercoiling induced structural change in an alternating CG sequence. Cell, 1984
- Strong doublet preferences in nucleotide sequences and DNA geometry. Journal of Molecular Evolution, 1984
- Viability of λ phages carrying a perfect palindrome in the absence of recombination nucleases. Nature, 1983
- Contextual constraints on synonymous codon choice. Journal of Molecular Biology, 1983
- Correlation between the abundance of Escherichia coli transfer RNAs and the occurrence of the respective codons in its protein genes: A proposal for a synonymous codon choice that is optimal for the E. coli translational system. Journal of Molecular Biology, 1981
- Method to determine the reading frame of a protein from the purine/pyrimidine genome sequence and its possible evolutionary justification. Proceedings of the National Academy of Sciences, 1981
- Correlation between the abundance of Escherichia coli transfer RNAs and the occurrence of the respective codons in its protein genes. Journal of Molecular Biology, 1981
- Codon catalog usage is a genome strategy modulated for gene expressivity. Nucleic Acids Research, 1981