Stochastic models for heterogeneous DNA sequences
- 1 January 1989
- journal article
- research article
- Published by Springer Nature in Bulletin of Mathematical Biology
- Vol. 51 (1) , 79-94
- https://doi.org/10.1007/bf02458837
Abstract
The composition of naturally occurring DNA sequences is often strikingly heterogeneous. In this paper, the DNA sequence is viewed as a stochastic process with local compositional properties determined by the states of a hidden Markov chain. The model used is a discrete-state, discrete-outcome version of a general model for non-stationary time series proposed by Kitagawa (1987). A smoothing algorithm is described which can be used to reconstruct the hidden process and produce graphic displays of the compositional structure of a sequence. The problem of parameter estimation is approached using likelihood methods and an EM algorithm for approximating the maximum likelihood estimate is derived. The methods are applied to sequences from yeast mitochondrial DNA, human and mouse mitochondrial DNAs, a human X chromosomal fragment and the complete genome of bacteriophage lambda.
This publication has 21 references indexed in Scilit:
- Compositional constraints and genome evolution. Journal of Molecular Evolution, 1986
- The Mosaic Genome of Warm-Blooded Vertebrates. Science, 1985
- Nucleotide sequence of bacteriophage λ DNA. Journal of Molecular Biology, 1982
- Sequence and gene organization of mouse mitochondrial DNA. Cell, 1981
- Sequence and organization of the human mitochondrial genome. Nature, 1981
- Estimating the Dimension of a Model. The Annals of Statistics, 1978
- Theoretical models for heterogeneity of base composition in DNA. Journal of Theoretical Biology, 1974
- Inference about the change-point in a sequence of random variables. Biometrika, 1970
- A Maximization Technique Occurring in the Statistical Analysis of Probabilistic Functions of Markov Chains. The Annals of Mathematical Statistics, 1970
- Segmental distribution of nucleotides in the DNA of bacteriophage lambda. Journal of Molecular Biology, 1968
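
The smoothing step mentioned in the abstract can be illustrated with a standard scaled forward-backward pass over a discrete hidden Markov model. This is only a minimal sketch, not the paper's own algorithm or parameterization: the two hidden states ("AT-rich" and "GC-rich"), the transition and emission probabilities, the toy sequence, and the names `smooth_states`, `INIT`, `TRANS`, `EMIT`, and `encode` are all assumptions chosen for illustration.

```python
from typing import List, Sequence

# Hypothetical two-state model (values are illustrative, not from the paper).
# State 0 = "AT-rich", state 1 = "GC-rich"; symbols are A, C, G, T -> 0..3.
INIT = [0.5, 0.5]
TRANS = [[0.95, 0.05],
         [0.05, 0.95]]
EMIT = [[0.4, 0.1, 0.1, 0.4],   # AT-rich composition
        [0.1, 0.4, 0.4, 0.1]]   # GC-rich composition


def encode(seq: str) -> List[int]:
    """Map a DNA string to integer symbols 0..3."""
    return ["ACGT".index(base) for base in seq.upper()]


def smooth_states(obs: Sequence[int],
                  init: Sequence[float],
                  trans: Sequence[Sequence[float]],
                  emit: Sequence[Sequence[float]]) -> List[List[float]]:
    """Return P(hidden state at t | whole sequence) for every position t."""
    n, k = len(obs), len(init)

    # Forward pass: alpha[t] is normalized at each step to avoid underflow.
    alpha: List[List[float]] = []
    scale: List[float] = []
    for t, o in enumerate(obs):
        if t == 0:
            row = [init[j] * emit[j][o] for j in range(k)]
        else:
            prev = alpha[-1]
            row = [emit[j][o] * sum(prev[i] * trans[i][j] for i in range(k))
                   for j in range(k)]
        c = sum(row)
        alpha.append([x / c for x in row])
        scale.append(c)

    # Backward pass: reuse the forward scaling constants.
    beta = [[1.0] * k for _ in range(n)]
    for t in range(n - 2, -1, -1):
        o = obs[t + 1]
        beta[t] = [
            sum(trans[i][j] * emit[j][o] * beta[t + 1][j] for j in range(k))
            / scale[t + 1]
            for i in range(k)
        ]

    # Smoothed posteriors: combine both passes and renormalize each row.
    posterior = []
    for t in range(n):
        row = [alpha[t][i] * beta[t][i] for i in range(k)]
        total = sum(row)
        posterior.append([x / total for x in row])
    return posterior


# Toy sequence: an AT-rich stretch followed by a GC-rich stretch.
toy = encode("ATATATTAAT" + "GCGCGGCCGC")
posterior = smooth_states(toy, INIT, TRANS, EMIT)
```

Plotting one column of `posterior` against sequence position would give the kind of compositional profile the abstract refers to. In practice the parameters would be estimated from the data, for example with an EM procedure, rather than fixed by hand as they are here.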