Using phoneme duration and energy contour information to improve large vocabulary isolated-word recognition
- 1 January 1991
- conference paper
- Published by Institute of Electrical and Electronics Engineers (IEEE)
- No. 15206149, pp. 341-344, vol. 1
- https://doi.org/10.1109/icassp.1991.150346
Abstract
Minimum duration constraints and energy thresholds for phonemes were used to increase the recognition accuracy of an 86,000-word speaker-trained isolated word recognizer. Minimum duration constraints force the phoneme models to map to acoustic segments longer than the duration minima for the phonemes. Such constraints result in significant lowering of likelihoods of many incorrect word choices, improving the accuracy of acoustic recognition and recognition with the language model. The phoneme models were also improved by correcting the segmentation of the phonemes in the training set. During training, the boundaries between phonemes are not marked accurately. Energy is used to correct these boundaries. Application of an energy threshold improves the segment boundaries between stops and sonorants (vowels, liquids, and glides), between fricatives and sonorants, between affricates and sonorants, and between breath noise and sonorants. On two speakers, the overall reduction in errors using minimum durations and energy thresholds is from 27.3% to 23.1% for acoustic recognition and from 14.3% to 8.8% with the language model.
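The effect of a minimum duration constraint can be illustrated with a small alignment sketch. This is not the paper's implementation: it is a simplified dynamic-programming alignment that assumes per-frame log-likelihoods are already available for each phoneme of a candidate word, and the function name `align_with_min_durations`, its arguments, and the toy numbers are all hypothetical. It shows how forcing each phoneme to span at least a minimum number of frames can lower, or rule out entirely, the score of a word hypothesis.

```python
import math


def align_with_min_durations(frame_loglik, min_dur):
    """Align frames to a fixed phoneme sequence under minimum durations.

    frame_loglik[t][i] is an assumed log-likelihood of frame t under the
    i-th phoneme model of the candidate word (hypothetical input).
    min_dur[i] is the minimum number of frames phoneme i must cover.
    Returns (score, ends): the best total log-likelihood and the exclusive
    end frame of each phoneme, or (-inf, []) if no alignment is feasible.
    """
    n_frames = len(frame_loglik)
    n_phones = len(min_dur)

    # prefix[i][t] = sum of frame_loglik[0..t-1][i], for fast segment scores.
    prefix = [[0.0] * (n_frames + 1) for _ in range(n_phones)]
    for i in range(n_phones):
        for t in range(n_frames):
            prefix[i][t + 1] = prefix[i][t] + frame_loglik[t][i]

    neg_inf = -math.inf
    # best[i][t] = best score with the first i phonemes covering frames 0..t-1.
    best = [[neg_inf] * (n_frames + 1) for _ in range(n_phones + 1)]
    back = [[-1] * (n_frames + 1) for _ in range(n_phones + 1)]
    best[0][0] = 0.0

    for i in range(1, n_phones + 1):
        d = min_dur[i - 1]
        for t in range(d, n_frames + 1):
            # Phoneme i-1 covers frames s..t-1, so t - s >= d.
            for s in range(0, t - d + 1):
                if best[i - 1][s] == neg_inf:
                    continue
                cand = best[i - 1][s] + prefix[i - 1][t] - prefix[i - 1][s]
                if cand > best[i][t]:
                    best[i][t] = cand
                    back[i][t] = s

    if best[n_phones][n_frames] == neg_inf:
        return neg_inf, []

    # Trace back the segment end points.
    ends = []
    t = n_frames
    for i in range(n_phones, 0, -1):
        ends.append(t)
        t = back[i][t]
    ends.reverse()
    return best[n_phones][n_frames], ends


# Toy data: frame 0 fits phoneme 0, frames 1-3 fit phoneme 1.
toy = [[-1.0, -5.0], [-5.0, -1.0], [-5.0, -1.0], [-5.0, -1.0]]
loose = align_with_min_durations(toy, [1, 1])
tight = align_with_min_durations(toy, [2, 1])
too_long = align_with_min_durations(toy, [3, 2])
```

In this toy case, raising the first phoneme's minimum from one frame to two pushes the boundary later and lowers the hypothesis score, and a word whose minima sum to more frames than are available becomes infeasible.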