Degraded word recognition based on segmental signal-to-noise ratio weighting

Abstract
Reliable recognition of noisy speech requires distance measures that are robust against noise disturbances. The local signal-to-noise ratio (SNR) of degraded speech varies over a wide range, and the characteristics of speech segments with low SNR tend to be lost. Pattern matching, however, is conventionally performed uniformly, without taking the local SNR of each analysis frame into account. An investigation of how representative LPC distance measures behave as a function of segmental SNR shows that the effect of segmental SNR on the distance measure must be accounted for. A double autocorrelation analysis is proposed as a spectrum estimation method. A pattern matching method is also introduced in which the segmental SNR is used as a weight. Isolated word recognition experiments were performed, and the results show the effectiveness of the proposed method.
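
The idea of using segmental SNR as a weight in pattern matching can be illustrated with a minimal sketch. This is not the authors' formulation, which the abstract does not give. It assumes that input and reference frames are already aligned (for example by dynamic time warping), that a noise power estimate is available, and that the weight is a clipped linear function of the SNR in dB. The function names, the 0 to 20 dB range, and the weighting rule are all hypothetical choices made for illustration.

```python
import math


def segmental_snr_db(frame, noise_power):
    """Estimate the SNR of one analysis frame in dB.

    Assumption: the noise power is known (e.g. measured in a silent
    interval) and is simply subtracted from the frame power.
    """
    power = sum(x * x for x in frame) / len(frame)
    signal_power = max(power - noise_power, 1e-12)  # avoid log of zero
    return 10.0 * math.log10(signal_power / noise_power)


def snr_weight(snr_db, lo=0.0, hi=20.0):
    """Map an SNR in dB to a weight in [0, 1] (hypothetical linear rule)."""
    return min(1.0, max(0.0, (snr_db - lo) / (hi - lo)))


def weighted_distance(frame_distances, snrs_db):
    """Average per-frame distances, weighting each frame by its SNR.

    Frames with low SNR contribute less to the overall word distance.
    Falls back to the plain mean if every weight is zero.
    """
    weights = [snr_weight(s) for s in snrs_db]
    total = sum(weights)
    if total == 0.0:
        return sum(frame_distances) / len(frame_distances)
    return sum(w * d for w, d in zip(weights, frame_distances)) / total
```

In this sketch a frame at or below 0 dB is ignored entirely and a frame at or above 20 dB counts fully; any per-frame spectral distance, such as an LPC-based measure, could supply the values in `frame_distances`.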
