A comparison of several acoustic representations for speech recognition with degraded and undegraded speech
- 13 January 2003
- conference paper
- Published by Institute of Electrical and Electronics Engineers (IEEE)
- ISSN 1520-6149, pp. 262-265
- https://doi.org/10.1109/icassp.1989.266415
Abstract
Several acoustic representations have been compared in speaker-dependent and independent connected and isolated-word recognition tests with undegraded speech and with speech degraded by adding white noise and by applying a 6-dB/octave spectral tilt. The representations comprised the output of an auditory model, cepstrum coefficients derived from an FFT-based mel-scale filter bank with various weighting schemes applied to the coefficients, cepstrum coefficients augmented with measures of their rates of change with time, and sets of linear discriminant functions derived from the filter-bank output and called IMELDA. The model outperformed the cepstrum representations except in noise-free connected-word tests, where it had a high insertion rate. The best cepstrum weighting scheme was derived from within-class variances. Its behavior may explain the empirical adjustments found necessary with other schemes. IMELDA outperformed all other representations in all conditions and is computationally simple.
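The abstract does not spell out how the weighting "derived from within-class variances" is computed. As a rough illustration only, the sketch below assumes one common reading: each cepstral coefficient is scaled by the inverse of its pooled within-class variance before a squared Euclidean distance is taken. The function names `within_class_variance` and `weighted_distance` and the toy data are hypothetical and are not taken from the paper.

```python
import numpy as np


def within_class_variance(features, labels):
    """Pooled per-coefficient variance of the frames around their class means.

    Assumption: this stands in for the paper's within-class statistic,
    whose exact definition the abstract does not give.
    """
    features = np.asarray(features, dtype=float)
    labels = np.asarray(labels)
    classes = np.unique(labels)
    residuals = np.empty_like(features)
    for c in classes:
        mask = labels == c
        # Subtract each class's own mean so only within-class spread remains.
        residuals[mask] = features[mask] - features[mask].mean(axis=0)
    # Pooled estimate: degrees of freedom are frames minus number of classes.
    return (residuals ** 2).sum(axis=0) / (len(features) - len(classes))


def weighted_distance(a, b, variances):
    """Squared Euclidean distance with each coefficient weighted by 1/variance."""
    diff = np.asarray(a, dtype=float) - np.asarray(b, dtype=float)
    return float(np.sum(diff ** 2 / variances))


# Toy example: two classes, two "cepstral" coefficients per frame.
frames = [[0, 0], [2, 4], [10, 10], [12, 14]]
classes = [0, 0, 1, 1]
variances = within_class_variance(frames, classes)
distance = weighted_distance(frames[0], frames[1], variances)
```

Under this assumed scheme, coefficients that vary widely within a class contribute less to the distance, which is one plausible way a variance-derived weighting could de-emphasize unreliable coefficients.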