Markov Processes: Linguistics and Zipf's Law
- 29 May 1995
- journal article
- research article
- Published by American Physical Society (APS) in Physical Review Letters
- Vol. 74 (22) , 4559-4562
- https://doi.org/10.1103/physrevlett.74.4559
Abstract
It is shown that a 2-parameter random Markov process constructed with $N$ states and biased random transitions gives rise to a stationary distribution where the probabilities of occurrence of the states, $P(k)$, exhibit the following three universal behaviors which characterize biological sequences and texts in natural languages: (a) the rank-ordered frequencies of occurrence of words are given by Zipf's law $P(k) \sim 1/k^{\rho}$, where $\rho(k)$ is slowly increasing for small $k$; (b) the frequencies of occurrence of letters are given by $P(k) = A - D \ln(k)$; and (c) long-range correlations are observed over long but finite intervals, as a result of the quasiergodicity of the Markov process.
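The abstract does not spell out how the process is constructed, so the following is only a rough sketch of the general idea: build a random Markov chain with biased transitions, estimate its stationary distribution, and rank-order the state probabilities. Everything specific here is an assumption made for illustration and is not taken from the paper: the names `stationary_distribution`, `rank_ordered`, `n_states`, and `bias`, the choice of two random target states per state with probabilities (1 + bias)/2 and (1 - bias)/2, and the use of a lazy power iteration.

```python
import random


def stationary_distribution(n_states, bias, seed=0, n_iter=2000):
    """Approximate a stationary distribution of a toy biased random chain.

    Hypothetical construction (not the paper's): every state has two
    randomly chosen target states, reached with probabilities
    (1 + bias) / 2 and (1 - bias) / 2.
    """
    rng = random.Random(seed)
    targets = [
        (rng.randrange(n_states), rng.randrange(n_states))
        for _ in range(n_states)
    ]
    p_hi = (1.0 + bias) / 2.0
    p_lo = (1.0 - bias) / 2.0

    # Start from the uniform distribution over states.
    p = [1.0 / n_states] * n_states
    for _ in range(n_iter):
        # Lazy step: stay put with probability 1/2, otherwise follow a
        # transition. This keeps the same stationary distributions while
        # avoiding oscillation on periodic chains.
        nxt = [0.5 * mass for mass in p]
        for state, (a, b) in enumerate(targets):
            nxt[a] += 0.5 * p_hi * p[state]
            nxt[b] += 0.5 * p_lo * p[state]
        p = nxt
    return p


def rank_ordered(probabilities):
    """Sort state probabilities from most to least frequent (rank 1 first)."""
    return sorted(probabilities, reverse=True)


ranked = rank_ordered(stationary_distribution(n_states=200, bias=0.6))
# ranked[k - 1] plays the role of P(k) for rank k in this toy setting.
```

Plotting `ranked` against rank on log-log axes is one way to check by eye whether a power-law regime appears. Whether this toy chain reproduces the behaviors reported in the abstract is not claimed here.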
This publication has 9 references indexed in Scilit:
- Linguistic Features of Noncoding DNA Sequences. Physical Review Letters, 1994
- Entropy and Long-Range Correlations in Literary English. Europhysics Letters, 1994
- Generalized Lévy-walk model for DNA nucleotide sequences. Physical Review E, 1993
- Long Range Correlation in Human Writings. Fractals, 1993
- Long-range correlations in nucleotide sequences. Nature, 1992
- A General Rule for Ranged Series of Codon Frequencies in Different Genomes. Journal of Biomolecular Structure and Dynamics, 1989
- Fractal Time in Condensed Matter. Annual Review of Physical Chemistry, 1988
- Inhomogeneous Magnetization in Dilute Asymmetric and Symmetric Systems. Physical Review Letters, 1988
- An Exactly Solvable Asymmetric Neural Network Model. Europhysics Letters, 1987