SPEECH CODING: RECOGNIZING WHAT WE DO NOT HEAR IN SPEECH

1 June 1983

journal article
research article
Published by Wiley in Annals of the New York Academy of Sciences

Vol. 405 (1), 18-32
https://doi.org/10.1111/j.1749-6632.1983.tb31614.x

Abstract

Speech is a highly redundant signal. This redundancy is important for reliable communication over air pathways, but a large part of it is useless for speech communication over digital channels. Speech coding aims to minimize the information rate needed to reproduce a speech signal with a specified fidelity. In this paper, we discuss factors that influence the design of efficient speech coders. The encoding and decoding processes invariably introduce error (noise and distortion) into the speech signal. The inability of the human ear to hear certain kinds of distortion in the speech signal plays a crucial role in producing high-quality speech at low bit rates. The physical difference between the waveform of a given speech signal and that of its coded replica generally tells us little about the subjective quality of the coded signal. A signal-to-noise ratio as small as 10 dB can be tolerated in the coded signal, provided the errors are distributed in both the time and frequency domains where they are least audible. Recent work on auditory masking has provided new insights for optimizing the performance of speech coders. This paper reviews that work and discusses new speech coding methods that attempt to maximize the perceptual similarity between the original speech signal and its coded replica. These new methods make it possible to reproduce speech signals at very low bit rates with little or no audible distortion.
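To make the 10 dB figure concrete, the following is a minimal sketch of the conventional waveform signal-to-noise ratio, i.e. the ratio of signal power to error power expressed in decibels. It is an illustration only, not a method taken from the paper: the function name `snr_db`, the 8 kHz sampling rate, and the two sine-wave test signals are all assumptions chosen for the example. At 10 dB the error power is one tenth of the signal power.

```python
import math


def snr_db(original, coded):
    """Waveform SNR in dB: total signal power divided by total error power."""
    if len(original) != len(coded):
        raise ValueError("signals must have the same length")
    signal_power = sum(x * x for x in original)
    error_power = sum((x - y) ** 2 for x, y in zip(original, coded))
    if error_power == 0:
        return math.inf   # identical waveforms
    if signal_power == 0:
        return -math.inf  # silent original, non-zero error
    return 10.0 * math.log10(signal_power / error_power)


# Hypothetical test signal: one second of a 200 Hz sine sampled at 8 kHz.
fs = 8000
original = [math.sin(2 * math.pi * 200 * n / fs) for n in range(fs)]

# Hypothetical "coding error": a 1 kHz sine scaled so that its power is
# one tenth of the signal power, which corresponds to an SNR of 10 dB.
error_gain = math.sqrt(0.1)
coded = [
    x + error_gain * math.sin(2 * math.pi * 1000 * n / fs)
    for n, x in enumerate(original)
]

print(round(snr_db(original, coded), 2))
```

Note that this measure depends only on the total error power. It is unchanged if the same error energy is moved to a different time or frequency region, which is why, as the abstract argues, it says little about how audible the error is.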

This publication has 9 references indexed in Scilit:

Optimizing predictive coders for minimum audible noise
Published by Institute of Electrical and Electronics Engineers (IEEE), 2005
A new model of LPC excitation for producing natural-sounding speech at low bit rates
Published by Institute of Electrical and Electronics Engineers (IEEE), 2005
Predictive Coding of Speech at Low Bit Rates
IEEE Transactions on Communications, 1982
Optimizing digital speech coders by exploiting masking properties of the human ear
The Journal of the Acoustical Society of America, 1979
Speech Coding
IEEE Transactions on Communications, 1979
Asymmetry of masking between noise and tone
Perception & Psychophysics, 1972
Speech Analysis Synthesis and Perception
Published by Springer Nature, 1972
Speech Analysis and Synthesis by Linear Prediction of the Speech Wave
The Journal of the Acoustical Society of America, 1971
Instantaneous companding of quantized signals
Bell System Technical Journal, 1957