SPEECH CODING: RECOGNIZING WHAT WE DO NOT HEAR IN SPEECH
- 1 June 1983
- journal article
- research article
- Published by Wiley in Annals of the New York Academy of Sciences
- Vol. 405 (1), 18-32
- https://doi.org/10.1111/j.1749-6632.1983.tb31614.x
Abstract
Speech is a highly redundant signal. The redundant nature of speech is important for providing reliable communication over air pathways. A large part of this redundancy is useless for speech communication over digital channels. Speech coding aims at minimizing the information rate needed to reproduce a speech signal with specified fidelity. In this paper, we discuss factors that influence the design of efficient speech coders. The encoding and decoding processes invariably introduce error (noise and distortion) in the speech signal. Inability of the human ear to hear certain kinds of distortions in the speech signal plays a crucial role in producing high-quality speech at low bit rates. The physical difference between the waveforms of a given speech signal and its coded replica generally does not tell us much about the subjective quality of the coded signal. A signal-to-noise ratio as small as 10 dB can be tolerated in the coded signal provided the errors are distributed both in time and frequency domains where they are least audible. Recent work on auditory masking has provided us with new insights for optimizing the performance of speech coders. This paper reviews this work and discusses new speech coding methods that attempt to maximize the perceptual similarity between the original speech signal and its coded replica. These new methods make it possible to reproduce speech signals at very low bit rates with little or no audible distortion.
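To make the 10 dB figure concrete, the sketch below shows how a waveform signal-to-noise ratio between a signal and its coded replica is typically computed. It is only an illustration under stated assumptions: the function name `snr_db`, the 200 Hz test tone, the 8 kHz sampling rate, and the coarse uniform quantizer standing in for a coder are all hypothetical and are not taken from the paper. The measure treats every error sample alike, which is why, as the abstract notes, it says little about subjective quality.

```python
import math

def snr_db(original, coded):
    """Waveform signal-to-noise ratio in dB between a signal and its coded replica."""
    signal_power = sum(x * x for x in original)
    noise_power = sum((x - y) ** 2 for x, y in zip(original, coded))
    if noise_power == 0:
        return math.inf  # identical waveforms: no coding error at all
    return 10.0 * math.log10(signal_power / noise_power)

# Hypothetical test signal: one second of a 200 Hz sine sampled at 8 kHz.
fs = 8000
original = [math.sin(2 * math.pi * 200 * n / fs) for n in range(fs)]

# A coarse uniform quantizer (assumed step size) stands in for a real coder.
step = 0.5
coded = [step * round(x / step) for x in original]

print(round(snr_db(original, coded), 1))
```

An SNR of 10 dB corresponds to an error power one tenth of the signal power. The computation ignores where the error falls in time and frequency, which is the aspect the abstract identifies as decisive for audibility.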