Statistical methods for DNA sequence segmentation
Open Access
- 1 May 1998
- journal article
- Published by Institute of Mathematical Statistics in Statistical Science
- Vol. 13 (2), 142-162
- https://doi.org/10.1214/ss/1028905933
Abstract
This article examines methods, issues and controversies that have arisen over the last decade in the effort to organize sequences of DNA base information into homogeneous segments. An array of different models and techniques has been considered and applied. We demonstrate that most approaches can be embedded into a suitable version of the multiple change-point problem, and we review the various methods in this light. We also propose and discuss a promising local segmentation method, namely, the application of split local polynomial fitting. The genome of bacteriophage $\lambda$ serves as an example sequence throughout the paper.