Codon choice in genes depends on flanking sequence information—implications for theoretical reverse translation
Open Access
- 18 January 2008
- journal article
- research article
- Published by Oxford University Press (OUP) in Nucleic Acids Research
- Vol. 36 (3), e16
- https://doi.org/10.1093/nar/gkm1181
Abstract
Algorithms for theoretical reverse translation have direct applications in degenerate PCR. The conventional practice is to create several degenerate primers, each of which variably encodes the peptide region of interest. In the current work, we analyzed, for each codon, the flanking residues in proteins and determined their influence on codon choice. From this, we created a method for theoretical reverse translation that includes information from the flanking residues of the protein in question. Our method, named the neighbor correlation method (NCM), and its enhancement, the consensus-NCM (c-NCM), performed significantly better than the conventional codon-usage statistic method (CSM). Using NCM and c-NCM, we were able to increase the average sequence identity from 77% up to 81%. Furthermore, we revealed a significant increase in coverage, at 80% identity, from <20% (CSM) to >75% (c-NCM). The algorithms, their applications and implications are discussed herein.
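To make the contrast between the two approaches concrete, the following is a minimal Python sketch, not the published algorithm. The abstract does not specify how NCM weighs flanking residues, so the sketch assumes the simplest possible reading: codon counts are conditioned on the immediately preceding and following amino acids, with a fallback to plain per-residue codon usage (the CSM-like baseline) when a context was never seen. The function names `train`, `reverse_translate` and `identity`, the boundary markers `^` and `$`, and the toy genes are all hypothetical.

```python
from collections import Counter, defaultdict
from itertools import product

# Standard genetic code, built from the usual compact TCAG-ordered layout.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TO_AA = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def codons(dna):
    """Split an in-frame coding sequence into codons, dropping any trailing partial codon."""
    return [dna[i:i + 3] for i in range(0, len(dna) - len(dna) % 3, 3)]


def train(genes):
    """Count codon usage per residue, and per (previous, current, next) residue context."""
    usage = defaultdict(Counter)    # aa -> codon counts (CSM-like baseline)
    context = defaultdict(Counter)  # (prev_aa, aa, next_aa) -> codon counts (assumed NCM-like)
    for dna in genes:
        cs = codons(dna)
        aas = [CODON_TO_AA[c] for c in cs]
        for i, (codon, aa) in enumerate(zip(cs, aas)):
            prev_aa = aas[i - 1] if i > 0 else "^"            # "^" marks the sequence start
            next_aa = aas[i + 1] if i + 1 < len(aas) else "$"  # "$" marks the sequence end
            usage[aa][codon] += 1
            context[(prev_aa, aa, next_aa)][codon] += 1
    return usage, context


def reverse_translate(protein, usage, context=None):
    """Pick the most frequent codon per residue; use flanking context when available.

    Assumes every residue in `protein` occurred at least once in the training genes.
    """
    out = []
    for i, aa in enumerate(protein):
        prev_aa = protein[i - 1] if i > 0 else "^"
        next_aa = protein[i + 1] if i + 1 < len(protein) else "$"
        counts = context.get((prev_aa, aa, next_aa)) if context else None
        if not counts:  # unseen context, or baseline mode: fall back to plain usage
            counts = usage[aa]
        out.append(counts.most_common(1)[0][0])
    return "".join(out)


def identity(predicted, actual):
    """Fraction of identical nucleotide positions between two equal-length sequences."""
    return sum(x == y for x, y in zip(predicted, actual)) / len(actual)


# Toy data (hypothetical): alanine is encoded by GCT after Met but by GCC before Trp.
genes = ["ATGGCTAAAGCCTGG", "ATGGCTAAAGCCTGG", "ATGGCTAAA"]
usage, context = train(genes)
baseline = reverse_translate("MAKAW", usage)
with_neighbors = reverse_translate("MAKAW", usage, context)
print(identity(baseline, genes[0]), identity(with_neighbors, genes[0]))
```

In this toy setting the baseline always emits the globally most common alanine codon, while the context-aware variant can recover a position-specific choice. It illustrates why flanking information can raise sequence identity, but it says nothing about the magnitude of the effect reported in the paper.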