Correlations in binary sequences and a generalized Zipf analysis

Abstract
We investigate correlated binary sequences using an n-tuple Zipf analysis, where we define “words” as strings of length n, and calculate the normalized frequency of occurrence ω(R) of “words” as a function of the word rank R. We analyze sequences with short-range Markovian correlations, as well as those with long-range correlations generated by three different methods: inverse Fourier transformation, Lévy walks, and the expansion-modification system. We study the relation between the exponent α characterizing long-range correlations and the exponent ζ characterizing power-law behavior in the Zipf plot. We also introduce a function P(ω), the frequency density, which is related to the inverse Zipf function R(ω), and find a simple relationship between ζ and ψ, where ω(R) ∼ R^{-ζ} and P(ω) ∼ ω^{-ψ}. Further, for Markovian sequences, we derive an approximate form for P(ω). Finally, we study the effect of a coarse-graining “renormalization” on sequences with Markovian and with long-range correlations.
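
As an illustration of the n-tuple Zipf analysis described in the abstract, the following is a minimal Python sketch under stated assumptions, not the authors' implementation. It assumes that words are read from overlapping windows of length n (the abstract does not say whether windows overlap) and that the Markovian test sequence is a two-state chain in which each symbol repeats its predecessor with probability p_repeat. The function names and the parameter p_repeat are hypothetical and introduced only for this example.

```python
import random
from collections import Counter


def markov_binary_sequence(length, p_repeat, seed=0):
    """Two-state Markov chain (assumed model): each symbol repeats
    the previous one with probability p_repeat, else it flips."""
    rng = random.Random(seed)
    seq = [rng.randint(0, 1)]
    for _ in range(length - 1):
        prev = seq[-1]
        seq.append(prev if rng.random() < p_repeat else 1 - prev)
    return seq


def zipf_ntuple(seq, n):
    """Normalized word frequencies in descending order.

    Words are overlapping n-tuples (an assumption of this sketch).
    The entry at index i is omega(R) for rank R = i + 1.
    """
    counts = Counter(tuple(seq[i:i + n]) for i in range(len(seq) - n + 1))
    total = sum(counts.values())
    return sorted((c / total for c in counts.values()), reverse=True)


if __name__ == "__main__":
    seq = markov_binary_sequence(100_000, p_repeat=0.8)
    omega = zipf_ntuple(seq, n=4)
    # Print the five most frequent words' normalized frequencies by rank.
    for rank, w in enumerate(omega[:5], start=1):
        print(rank, round(w, 4))
```

Plotting the returned list against rank on log-log axes gives the Zipf plot ω(R). A histogram of the same values would give an empirical estimate of the frequency density P(ω).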