Development of an acoustic-phonetic hidden Markov model for continuous speech recognition

1 January 1991

journal article
Published by Institute of Electrical and Electronics Engineers (IEEE) in IEEE Transactions on Signal Processing

Vol. 39 (1), 29-39
https://doi.org/10.1109/78.80762

Abstract

The techniques used to develop an acoustic-phonetic hidden Markov model, the problems associated with representing the whole acoustic-phonetic structure, the characteristics of the model, and how it performs as a phonetic decoder for recognition of fluent speech are discussed. The continuous variable duration model was trained using 450 sentences of fluent speech, each of which was spoken by a single speaker, and segmented and labeled using a fixed number of phonemes, each of which has a direct correspondence to the states of the matrix. The inherent variability of each phoneme is modeled as the observable random process of the Markov chain, while the phonotactic model of the unobservable phonetic sequence is represented by the state transition matrix of the hidden Markov model. The model assumes that the observed spectral data were generated by a Gaussian source. However, an analysis of the data shows that the spectra for most of the phonemes are not normally distributed and that an alternative representation would be beneficial.
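
The structure described in the abstract (one hidden state per phoneme, a state transition matrix acting as the phonotactic model, and Gaussian-distributed spectral observations) can be illustrated with a minimal Viterbi decoder. This is only a sketch under stated assumptions, not the paper's implementation: it uses an ordinary hidden Markov model without the variable duration modeling, diagonal covariances, and a toy two-phoneme inventory with one-dimensional "spectra". All names and numbers (PHONEMES, means, variances, frames) are hypothetical.

```python
import math


def log_gaussian(x, mean, var):
    """Log density of a diagonal-covariance Gaussian evaluated at vector x."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )


def viterbi(observations, log_init, log_trans, means, variances):
    """Return the most likely hidden state sequence for the observed frames."""
    n_states = len(log_init)
    # delta[j]: best log score of any path that ends in state j at this frame.
    delta = [
        log_init[j] + log_gaussian(observations[0], means[j], variances[j])
        for j in range(n_states)
    ]
    backpointers = []
    for x in observations[1:]:
        new_delta, pointers = [], []
        for j in range(n_states):
            # Best predecessor of state j under the transition matrix.
            best_i = max(range(n_states), key=lambda i: delta[i] + log_trans[i][j])
            pointers.append(best_i)
            new_delta.append(
                delta[best_i]
                + log_trans[best_i][j]
                + log_gaussian(x, means[j], variances[j])
            )
        delta = new_delta
        backpointers.append(pointers)
    # Trace back from the best final state.
    state = max(range(n_states), key=lambda j: delta[j])
    path = [state]
    for pointers in reversed(backpointers):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path


# Hypothetical toy setup: two phoneme states with 1-D Gaussian observations.
PHONEMES = ["aa", "s"]
log_init = [math.log(0.5), math.log(0.5)]
# Transition matrix standing in for the phonotactic model (assumed values).
log_trans = [
    [math.log(0.9), math.log(0.1)],
    [math.log(0.1), math.log(0.9)],
]
means = [[0.0], [5.0]]
variances = [[1.0], [1.0]]
frames = [[0.1], [-0.2], [0.3], [4.8], [5.1], [5.3]]

decoded = [PHONEMES[s] for s in viterbi(frames, log_init, log_trans, means, variances)]
print(decoded)
```

In this toy setting the first three frames lie near the mean of the first state and the last three near the mean of the second, so the decoded labels switch once between them. The abstract's finding that most phoneme spectra are not normally distributed suggests that the Gaussian emission term in such a decoder is the part an alternative representation would replace.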

This publication has 15 references indexed in Scilit:

Continuous speech recognition by means of acoustic/phonetic classification obtained from a hidden Markov model
Published by Institute of Electrical and Electronics Engineers (IEEE), 2005
BYBLOS: The BBN continuous speech recognition system
Published by Institute of Electrical and Electronics Engineers (IEEE), 2005
A weighted cepstral distance measure for speech recognition
Published by Institute of Electrical and Electronics Engineers (IEEE), 2005
A stochastic segment model for phoneme-based continuous speech recognition
IEEE Transactions on Acoustics, Speech, and Signal Processing, 1989
Network-based connected digit recognition
IEEE Transactions on Acoustics, Speech, and Signal Processing, 1987
On the role of spectral transition for speech perception
The Journal of the Acoustical Society of America, 1986
A Segmental k-Means Training Procedure for Connected Word Recognition
AT&T Technical Journal, 1986
Continuously variable duration hidden Markov models for automatic speech recognition
Computer Speech & Language, 1986
Segmental durations in connected speech signals: Preliminary results
The Journal of the Acoustical Society of America, 1982
Error bounds for convolutional codes and an asymptotically optimum decoding algorithm
IEEE Transactions on Information Theory, 1967