Modeling acoustic transitions in speech by state-interpolation hidden Markov models
- 1 January 1992
- journal article
- Published by Institute of Electrical and Electronics Engineers (IEEE) in IEEE Transactions on Signal Processing
- Vol. 40 (2), 265-271
- https://doi.org/10.1109/78.124937
Abstract
The authors present a new type of hidden Markov model (HMM) for vowel-to-consonant (VC) and consonant-to-vowel (CV) transitions, based on the locus theory of speech perception. The parameters of the model can be trained automatically using the Baum-Welch algorithm, and the training procedure does not require that instances of all possible CV and VC pairs be present. When incorporated into an isolated-word recognizer with a 75,000-word vocabulary, the model leads to a modest improvement in recognition rates. The authors give recognition results for the state-interpolation HMM and compare them to those obtained by standard context-independent HMMs and generalized triphone models.