Auditory distortion measure for speech coding
- 1 January 1991
- conference paper
- Published by Institute of Electrical and Electronics Engineers (IEEE)
- p. 493-496 vol.1
- https://doi.org/10.1109/icassp.1991.150384
Abstract
A novel perceptually motivated objective measure for estimating the subjective quality of coded speech is presented. It takes into account auditory frequency warping (Bark transformation), critical-band integration, amplitude sensitivity variations with frequency, and conversion from loudness level to loudness. For each 10 ms segment of an utterance, a weighted spectral vector is computed via 15 critical-band filters. The overall distortion, called Bark spectral distortion (BSD), is the average squared Euclidean distance between the spectral vectors of the original and coded utterances. In tests with speech distorted by a modulated noise reference unit or coded at rates of 2.4-64 kb/s, the measure predicted mean opinion score (MOS) ratings notably better than segmental SNR did. The standard error in estimating MOS with the new measure was 0.2-0.3.
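The computation outlined in the abstract can be illustrated with a rough sketch. Only the 10 ms segmentation, the 15 bands, and the averaged squared Euclidean distance come from the abstract. Everything else here is an assumption made for illustration and is not the paper's method: the Zwicker-Terhardt Bark approximation, rectangular bands spaced uniformly in Bark in place of critical-band filters, a power-law exponent of 0.3 as a stand-in for the loudness conversion, the 8 kHz sample rate, and all function names. The frequency-dependent amplitude sensitivity weighting is omitted entirely.

```python
import numpy as np


def hz_to_bark(f_hz):
    """Zwicker-Terhardt Hz-to-Bark approximation (assumed; the paper may differ)."""
    f = np.asarray(f_hz, dtype=float)
    return 13.0 * np.arctan(0.00076 * f) + 3.5 * np.arctan((f / 7500.0) ** 2)


def bark_spectrum(frame, sample_rate, n_bands=15):
    """Return a compressed band-energy vector for one frame."""
    # Power spectrum of the windowed frame.
    power = np.abs(np.fft.rfft(frame * np.hanning(len(frame)))) ** 2
    freqs = np.fft.rfftfreq(len(frame), d=1.0 / sample_rate)
    bark = hz_to_bark(freqs)
    # Assumption: rectangular bands, uniform in Bark, instead of real
    # critical-band filter shapes.
    edges = np.linspace(0.0, bark[-1], n_bands + 1)
    band_idx = np.clip(np.digitize(bark, edges) - 1, 0, n_bands - 1)
    band_power = np.bincount(band_idx, weights=power, minlength=n_bands)
    # Assumption: power-law compression as a crude loudness stand-in.
    return band_power ** 0.3


def bark_spectral_distortion(original, coded, sample_rate=8000, frame_ms=10):
    """Average squared Euclidean distance between per-frame band vectors."""
    original = np.asarray(original, dtype=float)
    coded = np.asarray(coded, dtype=float)
    frame_len = int(sample_rate * frame_ms / 1000)
    n_frames = min(len(original), len(coded)) // frame_len
    if n_frames == 0:
        raise ValueError("signals are shorter than one frame")
    distances = []
    for i in range(n_frames):
        seg = slice(i * frame_len, (i + 1) * frame_len)
        diff = (bark_spectrum(original[seg], sample_rate)
                - bark_spectrum(coded[seg], sample_rate))
        distances.append(float(np.dot(diff, diff)))
    return float(np.mean(distances))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    t = np.arange(8000) / 8000.0
    clean = np.sin(2 * np.pi * 440.0 * t)
    noisy = clean + 0.1 * rng.standard_normal(clean.shape)
    print(bark_spectral_distortion(clean, noisy))
```

The value returned by this sketch is not calibrated against MOS and is not comparable to the figures reported in the abstract; it only shows the shape of the per-frame comparison.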