Transform representation of the spectra of acoustic speech segments with applications. I. General approach and application to speech recognition
- 1 April 1993
- journal article
- research article
- Published by Institute of Electrical and Electronics Engineers (IEEE) in IEEE Transactions on Speech and Audio Processing
- Vol. 1 (2) , 180-195
- https://doi.org/10.1109/89.222877
Abstract
We present in this series of two papers a new approach for modeling and capturing the time-varying structure of the spectral envelope of speech. In this approach, we use an acoustic subword decomposition and the Karhunen-Loève transform (KLT) to extract and efficiently represent the highly correlated structure of the spectral envelope. Integration of the KLT with acoustic subword modeling is a novel approach that concisely represents both steady-state and dynamic features of the spectra in a unified framework that very effectively captures acoustic-phonetic patterns. The organization of these two papers is as follows: the first paper, Part I, presents the physiological and perceptual basis for the approach, the frame-based and acoustic-subword-based spectral representation, and applications to speaker-dependent recognition. The performance of the recognition algorithm based on this approach compares favorably to other existing techniques. Part II will present a frequency-domain coding technique by analysis/synthesis. This application of the new method produces good quality speech at low bit rates.
This publication has 18 references indexed in Scilit:
- Automatic recognition of syllabic speech segments using spectral and temporal features. Published by Institute of Electrical and Electronics Engineers (IEEE), 2005
- Discrete utterance speech recognition without time alignment. Published by Institute of Electrical and Electronics Engineers (IEEE), 2005
- Characterization of spectral transitions with applications to acoustic sub-word segmentation and automatic speech recognition. Published by Institute of Electrical and Electronics Engineers (IEEE), 2003
- Robust LPC analysis and synthesis using the KL transformation of acoustic subwords spectra. Published by Institute of Electrical and Electronics Engineers (IEEE), 2003
- Automatic speech recognition using acoustic sub-words and no time alignment. Published by Institute of Electrical and Electronics Engineers (IEEE), 2003
- Frame-specific statistical features for speaker independent speech recognition. IEEE Transactions on Acoustics, Speech, and Signal Processing, 1986
- Discrete utterance speech recognition without time alignment. IEEE Transactions on Information Theory, 1983
- Segmental durations in connected speech signals: Preliminary results. The Journal of the Acoustical Society of America, 1982
- Speech Analysis Synthesis and Perception. Published by Springer Nature, 1972
- Critical Bands. Published by Elsevier, 1970