From HMM's to segment models: a unified view of stochastic modeling for speech recognition
- 1 September 1996
- journal article
- Published by Institute of Electrical and Electronics Engineers (IEEE) in IEEE Transactions on Speech and Audio Processing
- Vol. 4 (5), 360-378
- https://doi.org/10.1109/89.536930
Abstract
Many alternative models have been proposed to address some of the shortcomings of the hidden Markov model (HMM), which is currently the most popular approach to speech recognition. In particular, a variety of models that could be broadly classified as segment models have been described for representing a variable-length sequence of observation vectors in speech recognition applications. Since there are many aspects in common between these approaches, including the general recognition and training problems, it is useful to consider them in a unified framework. The paper describes a general stochastic model that encompasses most of the models proposed in the literature, pointing out similarities of the models in terms of correlation and parameter tying assumptions, and drawing analogies between segment models and HMMs. In addition, we summarize experimental results assessing different modeling assumptions and point out remaining open questions.