Comparison of parametric representations for monosyllabic word recognition in continuously spoken sentences
- 1 August 1980
- journal article
- Published by Institute of Electrical and Electronics Engineers (IEEE) in IEEE Transactions on Acoustics, Speech, and Signal Processing
- Vol. 28 (4), 357-366
- https://doi.org/10.1109/tassp.1980.1163420
Abstract
Several parametric representations of the acoustic signal were compared with regard to word recognition performance in a syllable-oriented continuous speech recognition system. The vocabulary included many phonetically similar monosyllabic words; therefore the emphasis was on the ability to retain phonetically significant acoustic information in the face of syntactic and duration variations. For each parameter set (based on a mel-frequency cepstrum, a linear frequency cepstrum, a linear prediction cepstrum, a linear prediction spectrum, or a set of reflection coefficients), word templates were generated using an efficient dynamic warping method, and test data were time registered with the templates. A set of ten mel-frequency cepstrum coefficients computed every 6.4 ms resulted in the best performance, namely 96.5 percent and 95.0 percent recognition with each of two speakers. The superior performance of the mel-frequency cepstrum coefficients may be attributed to the fact that they better represent the perceptually relevant aspects of the short-term speech spectrum.
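As a rough illustration of the two ingredients the abstract names, the sketch below computes ten mel-frequency cepstrum coefficients per frame and time-registers two coefficient sequences with a textbook dynamic time warping recursion. It is not the authors' implementation. The 10 kHz sampling rate (so that a 64-sample hop equals 6.4 ms), the 256-sample Hamming-windowed frame, the 20 triangular filters, the `2595 * log10(1 + f / 700)` mel formula, the Euclidean frame distance, and every function name are assumptions chosen for illustration, and the simple symmetric warping here stands in for the paper's "efficient dynamic warping method", which the abstract does not specify.

```python
import numpy as np

def hz_to_mel(f):
    # One common mel-scale convention (an assumption, not taken from the abstract).
    return 2595.0 * np.log10(1.0 + np.asarray(f, dtype=float) / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (np.asarray(m, dtype=float) / 2595.0) - 1.0)

def mel_filterbank(n_filters, n_fft, sample_rate):
    """Triangular filters whose edges are equally spaced on the mel scale."""
    edges_hz = mel_to_hz(np.linspace(0.0, hz_to_mel(sample_rate / 2.0), n_filters + 2))
    freqs = np.fft.rfftfreq(n_fft, d=1.0 / sample_rate)
    bank = np.zeros((n_filters, freqs.size))
    for k in range(n_filters):
        lo, mid, hi = edges_hz[k], edges_hz[k + 1], edges_hz[k + 2]
        rising = (freqs - lo) / (mid - lo)
        falling = (hi - freqs) / (hi - mid)
        bank[k] = np.clip(np.minimum(rising, falling), 0.0, None)
    return bank

def mfcc(frame, sample_rate, n_filters=20, n_coeffs=10):
    """Cosine transform of log mel-filterbank energies for one frame."""
    windowed = frame * np.hamming(frame.size)
    power = np.abs(np.fft.rfft(windowed)) ** 2
    energies = mel_filterbank(n_filters, frame.size, sample_rate) @ power
    log_energies = np.log(energies + 1e-10)  # small floor avoids log(0)
    k = np.arange(1, n_filters + 1)
    i = np.arange(1, n_coeffs + 1)[:, None]
    basis = np.cos(i * (k - 0.5) * np.pi / n_filters)
    return basis @ log_energies

def mfcc_sequence(signal, sample_rate, frame_len=256, hop=64):
    """One coefficient vector per hop; hop=64 is 6.4 ms at an assumed 10 kHz."""
    starts = range(0, len(signal) - frame_len + 1, hop)
    return np.array([mfcc(signal[s:s + frame_len], sample_rate) for s in starts])

def dtw_distance(a, b):
    """Accumulated Euclidean cost of the best monotonic alignment of a and b."""
    n, m = len(a), len(b)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = np.linalg.norm(a[i - 1] - b[j - 1])
            cost[i, j] = d + min(cost[i - 1, j], cost[i, j - 1], cost[i - 1, j - 1])
    return cost[n, m]
```

In a template-matching setup of this kind, a test token would be assigned to the word whose template yields the smallest `dtw_distance`. Because the warping path may repeat frames, a sequence and a uniformly stretched copy of it align at zero cost, which is the sense in which such matching tolerates duration variation.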
This publication has 15 references indexed in Scilit:
- Order dependence in templates for monosyllabic word identification. Published by Institute of Electrical and Electronics Engineers (IEEE), 2005
- Recognition of monosyllabic words in continuous sentences using composite word templates. Published by Institute of Electrical and Electronics Engineers (IEEE), 2005
- Considerations in dynamic time warping algorithms for discrete word recognition. IEEE Transactions on Acoustics, Speech, and Signal Processing, 1978
- On creating reference templates for speaker independent recognition of isolated words. IEEE Transactions on Acoustics, Speech, and Signal Processing, 1978
- Distance measures for speech processing. IEEE Transactions on Acoustics, Speech, and Signal Processing, 1976
- Linear Prediction of Speech. Published by Springer Nature, 1976
- Automatic segmentation of speech into syllabic units. The Journal of the Acoustical Society of America, 1975
- Minimum prediction residual principle applied to speech recognition. IEEE Transactions on Acoustics, Speech, and Signal Processing, 1975
- Direct estimation of the vocal tract shape by inverse filtering of acoustic speech waveforms. IEEE Transactions on Audio and Electroacoustics, 1973
- Automatic recognition of 200 words. International Journal of Man-Machine Studies, 1970