Codon preference and primary sequence structure in protein-coding regions
- 1 January 1989
- journal article
- Published by Springer Nature in Bulletin of Mathematical Biology
- Vol. 51 (1) , 95-115
- https://doi.org/10.1007/bf02458838
Abstract
The stochastic complexity of a data base of 365 protein-coding regions is analysed. When the primary sequence is modeled as a spatially homogeneous Markov source, the fit to observed codon preference is very poor. The situation improves substantially when a non-homogeneous model is used. Some implications for the estimation of species phylogeny and substitution rates are discussed.
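To illustrate the contrast the abstract draws, here is a minimal sketch comparing a homogeneous first-order Markov model with a non-homogeneous one whose transition table depends on codon position. It is not the paper's actual procedure: the function name `code_length_bits`, the first-order assumption, the toy sequence, and the crude two-part penalty of half a log per free parameter (a rough stand-in for stochastic complexity) are all assumptions made for illustration.

```python
import math
from collections import Counter

BASES = "ACGT"

def code_length_bits(seq, period=1):
    """Approximate two-part code length (in bits) of `seq` under a
    first-order Markov model whose transition table depends on the
    position index modulo `period`.

    period=1 gives a spatially homogeneous chain; period=3 gives one
    table per codon position (a simple non-homogeneous model).
    Hypothetical helper, not taken from the paper.
    """
    pair_counts = Counter()  # (phase, previous base, current base) -> count
    ctx_counts = Counter()   # (phase, previous base) -> count
    for i in range(1, len(seq)):
        phase = i % period
        pair_counts[(phase, seq[i - 1], seq[i])] += 1
        ctx_counts[(phase, seq[i - 1])] += 1

    # Negative log-likelihood at the maximum-likelihood transition estimates.
    nll = 0.0
    for (phase, prev, _cur), count in pair_counts.items():
        nll -= count * math.log2(count / ctx_counts[(phase, prev)])

    # Assumed parameter cost: (k / 2) * log2(n) for k free parameters.
    n_params = period * len(BASES) * (len(BASES) - 1)
    penalty = 0.5 * n_params * math.log2(len(seq) - 1)
    return nll + penalty

# Toy "coding" sequence: the first two codon positions are fixed (G, C)
# and the third cycles through all four bases. Purely synthetic data.
toy = "GCAGCCGCGGCT" * 200

homogeneous = code_length_bits(toy, period=1)
codon_phased = code_length_bits(toy, period=3)
print(f"homogeneous: {homogeneous:.1f} bits, codon-phased: {codon_phased:.1f} bits")
```

On a sequence like this, where base usage differs by codon position, the position-dependent model should yield the shorter code length despite carrying three times as many parameters. Real protein-coding data would call for a more careful model class and complexity measure than this sketch uses.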
This publication has 29 references indexed in Scilit:
- A measure of the similarity of sets of sequences not requiring sequence alignment. Proceedings of the National Academy of Sciences, 1986
- Codon usage tabulated from the GenBank genetic sequence data. Nucleic Acids Research, 1986
- Codon usage and genome composition. Journal of Molecular Evolution, 1985
- The Mosaic Genome of Warm-Blooded Vertebrates. Science, 1985
- Markov chain analysis finds a significant influence of neighboring bases on the occurrence of a base in eucaryotic nuclear DNA sequences both protein-coding and noncoding. Journal of Molecular Evolution, 1985
- The Neutral Theory of Molecular Evolution. Published by Cambridge University Press (CUP), 1983
- A Markov analysis of DNA sequences. Journal of Theoretical Biology, 1983
- Statistical Inference of Phylogenies. Journal of the Royal Statistical Society. Series A (General), 1983
- A search for patterns in the nucleotide sequence of the MS2 genome. Journal of Mathematical Biology, 1979
- Statistical Inference Regarding Markov Chain Models. Journal of the Royal Statistical Society Series C: Applied Statistics, 1973