Auditory distortion measure for speech coding

Abstract
A novel perceptually motivated objective measure for estimating the subjective quality of coded speech is presented. It takes into account auditory frequency warping (Bark transformation), critical-band integration, amplitude sensitivity variations with frequency, and conversion from loudness level to loudness. For each 10 ms segment of an utterance, a weighted spectral vector is computed via 15 critical-band filters. The overall distortion, called Bark spectral distortion (BSD), is the average squared Euclidean distance between the spectral vectors of the original and coded utterances. In tests with speech distorted by a modulated noise reference unit or coded at rates of 2.4-64 kb/s, the measure predicted mean opinion score (MOS) ratings notably better than segmental SNR did. The standard error in estimating MOS with the new measure was 0.2-0.3.
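The processing chain described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract gives no formulas, so the Bark mapping (the Zwicker-Terhardt approximation), the rectangular one-Bark-wide bands standing in for the 15 critical-band filters, the Hann window, the 8 kHz sampling rate, and the 0.3 power-law compression used in place of the loudness conversion are all assumptions. The frequency-dependent amplitude sensitivity weighting is omitted entirely. The names `hz_to_bark`, `bark_spectrum`, and `bark_spectral_distortion` are hypothetical.

```python
import numpy as np


def hz_to_bark(f_hz):
    """Zwicker-Terhardt approximation of the Bark scale (assumed, not from the abstract)."""
    f = np.asarray(f_hz, dtype=float)
    return 13.0 * np.arctan(0.00076 * f) + 3.5 * np.arctan((f / 7500.0) ** 2)


def bark_spectrum(frame, fs, n_bands=15):
    """Map one frame to a vector of n_bands compressed band powers."""
    frame = np.asarray(frame, dtype=float)
    windowed = frame * np.hanning(len(frame))
    power = np.abs(np.fft.rfft(windowed)) ** 2
    bark = hz_to_bark(np.fft.rfftfreq(len(frame), d=1.0 / fs))
    # Rectangular one-Bark-wide bands stand in for the critical-band filters.
    band_power = np.array(
        [power[(bark >= lo) & (bark < lo + 1)].sum() for lo in range(n_bands)]
    )
    # Power-law compression: a crude stand-in for the loudness conversion.
    return band_power ** 0.3


def bark_spectral_distortion(original, coded, fs=8000, frame_ms=10.0):
    """Average squared Euclidean distance between per-frame band vectors."""
    original = np.asarray(original, dtype=float)
    coded = np.asarray(coded, dtype=float)
    n = int(round(fs * frame_ms / 1000.0))  # samples per 10 ms segment
    n_frames = min(len(original), len(coded)) // n
    if n_frames == 0:
        raise ValueError("signals are shorter than one frame")
    distances = []
    for k in range(n_frames):
        a = bark_spectrum(original[k * n:(k + 1) * n], fs)
        b = bark_spectrum(coded[k * n:(k + 1) * n], fs)
        distances.append(np.sum((a - b) ** 2))
    return float(np.mean(distances))
```

A call such as `bark_spectral_distortion(x, y)` expects two time-aligned signals at the same sampling rate. Identical inputs give a distortion of zero, and the value grows as the coded signal departs from the original. Mapping the result to a MOS estimate would need a regression fitted to listening-test data, which this sketch does not attempt.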
