Counts of long aligned word matches among random letter sequences
- 1 June 1987
- journal article
- Published by Cambridge University Press (CUP) in Advances in Applied Probability
- Vol. 19 (2), 293-351
- https://doi.org/10.2307/1427422
Abstract
Asymptotic distributional properties of the maximal length aligned word (a contiguous set of letters) among multiple random Markov dependent sequences composed of letters from a finite alphabet are given. For sequences of length N, C_{r,s}(N), defined as the longest common aligned word found in r or more of s sequences, has order growth log N/(−log λ), where λ is the maximal eigenvalue of r-Schur product matrices from among the collections of Markov matrices that generate the sequences. The count Z*_{r,s}(N, k) of positions that initiate an aligned match of length exceeding k = log N/(−log λ) + x but fail to match at the immediately preceding position has a limiting Poisson distribution. Distributional properties of other long aligned word relationships and patterns are also discussed.
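The quantities named in the abstract can be illustrated with a small Python sketch. It is not taken from the article: the function names (longest_aligned_match, count_match_starts, schur_product, max_eigenvalue, growth_order) are hypothetical, the match count is written only for the simplified case where all supplied sequences must agree (r = s), position 0 is treated as a valid match start, and the example assumes independent uniform letters, so that every transition matrix has entries 1/4 and the Schur (entrywise) product of two of them has maximal eigenvalue 1/4.

```python
import math
import random
from itertools import combinations


def longest_aligned_match(seqs, r):
    """Length of the longest aligned word shared by at least r of the sequences."""
    n = min(len(s) for s in seqs)
    best = 0
    # A word found in r or more sequences is found in some subset of exactly r.
    for subset in combinations(seqs, r):
        run = 0
        for i in range(n):
            if all(s[i] == subset[0][i] for s in subset):
                run += 1
                best = max(best, run)
            else:
                run = 0
    return best


def count_match_starts(seqs, k):
    """Count positions that start an aligned match of length > k across ALL
    given sequences, with a mismatch (or the sequence start) just before.
    Assumption: simplified r = s case; position 0 counts as a start."""
    n = min(len(s) for s in seqs)
    count = 0
    run = 0
    for i in range(n):
        if all(s[i] == seqs[0][i] for s in seqs):
            run += 1
        else:
            if run > k:
                count += 1
            run = 0
    if run > k:  # a qualifying run that reaches the end of the sequences
        count += 1
    return count


def schur_product(mats):
    """Entrywise (Schur/Hadamard) product of equally sized square matrices."""
    size = len(mats[0])
    return [[math.prod(m[i][j] for m in mats) for j in range(size)]
            for i in range(size)]


def max_eigenvalue(mat, iters=200):
    """Power iteration; assumes a nonnegative primitive matrix."""
    size = len(mat)
    v = [1.0] * size
    lam = 0.0
    for _ in range(iters):
        w = [sum(mat[i][j] * v[j] for j in range(size)) for i in range(size)]
        lam = max(w)
        v = [x / lam for x in w]
    return lam


def growth_order(n, lam):
    """The order log N / (-log lambda) quoted in the abstract."""
    return math.log(n) / (-math.log(lam))


if __name__ == "__main__":
    random.seed(0)
    alphabet = "ACGT"
    n = 10_000
    # Assumption: i.i.d. uniform letters, a special case of a Markov chain.
    seqs = ["".join(random.choice(alphabet) for _ in range(n)) for _ in range(3)]
    uniform = [[0.25] * 4 for _ in range(4)]
    lam = max_eigenvalue(schur_product([uniform, uniform]))
    print("lambda:", lam)
    print("predicted order:", growth_order(n, lam))
    print("observed longest match in 2 of 3:", longest_aligned_match(seqs, 2))
    k = int(growth_order(n, lam))
    print("match starts longer than k (first two sequences):",
          count_match_starts(seqs[:2], k))
```

The observed longest match in a single simulated run should only be expected to land near the predicted order, not to equal it; the abstract's result concerns the limiting distribution as N grows.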
This publication has 15 references indexed in Scilit:
- An extreme value theory for long head runs. Probability Theory and Related Fields, 1986
- Critical Phenomena in Sequence Matching. The Annals of Probability, 1985
- Some monotonicity properties of Schur powers of matrices and related inequalities. Linear Algebra and its Applications, 1985
- The first repetition of a pattern in a symmetric Bernoulli sequence. Journal of Applied Probability, 1983
- Long Common Subsequences and the Proximity of Two Random Strings. SIAM Journal on Applied Mathematics, 1982
- Long repetitive patterns in random sequences. Probability Theory and Related Fields, 1980
- Some limit results for longest common subsequences. Discrete Mathematics, 1979
- Longest common subsequences of two random sequences. Journal of Applied Probability, 1975
- Limit Distribution of Random Variables Associated with Multiple Long Duplications in a Sequence of Independent Trials. Theory of Probability and Its Applications, 1974
- Limit Distributions of Random Variables Associated with Long Duplications in a Sequence of Independent Trials. Theory of Probability and Its Applications, 1974