Abstract
Using the maximum likelihood technique, an algorithm is developed for extracting pitch from speech that has been corrupted by additive noise. The speech model includes the effects of pitch periodicity and the spectral envelope, which results in a processing structure consisting of a noise suppression prefilter in cascade with a comb filter bank estimator-correlator. The prefilter attenuates those frequency bands where the speech signal-to-noise ratio is low; hence most of the deleterious noise is rejected before the comb filter bank correlator determines pitch. The comb filter interpretation leads to an implementation of the correlation function that avoids anomalous pitch errors due to windowing and formant sidelobe interaction, which in turn obviates the need for any type of spectral flattening. Pitch ambiguities are resolved using a majority logic scoring algorithm and a carefully designed pitch tracker that can adapt rapidly to gross pitch variations. The voiced/unvoiced decision is based on an adaptive minimum energy threshold, a high/low band energy measurement, a normalized pitch correlation coefficient, and a pitch track continuity coefficient. A time domain implementation of the algorithm that runs in real time in conjunction with an LPC analysis/synthesis system at 2400 bps is described. (Author)
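
To make the correlation stage concrete, the following is a minimal time domain sketch of a normalized pitch correlation search over candidate lags, followed by a threshold test for voicing. It is an illustration under stated assumptions, not the estimator described in the abstract: the noise suppression prefilter, the majority logic scoring, the pitch tracker, and the energy-based voicing measures are all omitted, and the function name `estimate_pitch`, the 70 to 200 Hz search range, and the 0.5 voicing threshold are hypothetical choices made only for this example.

```python
import math


def estimate_pitch(frame, sample_rate, f_min=70.0, f_max=200.0,
                   voicing_threshold=0.5):
    """Return (pitch_hz, correlation, voiced) for one frame of samples.

    Assumption: a plain normalized cross-correlation between the frame
    and a lagged copy of itself stands in for the comb filter bank
    correlator; the search range and threshold are illustrative values.
    """
    min_lag = int(sample_rate / f_max)   # shortest candidate period
    max_lag = int(sample_rate / f_min)   # longest candidate period
    n = len(frame) - max_lag             # samples compared at every lag
    if n <= 0:
        raise ValueError("frame too short for the requested pitch range")

    # Energy of the reference segment is the same for every lag.
    e0 = sum(frame[i] ** 2 for i in range(n))

    best_lag, best_corr = min_lag, -1.0
    for lag in range(min_lag, max_lag + 1):
        num = sum(frame[i] * frame[i + lag] for i in range(n))
        e1 = sum(frame[i + lag] ** 2 for i in range(n))
        denom = math.sqrt(e0 * e1)
        # A silent segment has no defined correlation; treat it as 0.
        corr = num / denom if denom > 0.0 else 0.0
        if corr > best_corr:
            best_lag, best_corr = lag, corr

    # Voicing decision from the normalized correlation alone (a
    # simplification of the multi-feature decision in the abstract).
    voiced = best_corr >= voicing_threshold
    return sample_rate / best_lag, best_corr, voiced


if __name__ == "__main__":
    fs = 8000
    # Synthetic voiced frame: three harmonics of a 100 Hz fundamental.
    frame = [
        sum(a * math.sin(2.0 * math.pi * 100.0 * k * t / fs)
            for k, a in ((1, 1.0), (2, 0.5), (3, 0.25)))
        for t in range(480)
    ]
    print(estimate_pitch(frame, fs))
```

Because the search range here spans less than one octave above the lowest candidate, a lag of twice the true period cannot be selected. A wider range would need an explicit rule for resolving such pitch ambiguities, which is the role the abstract assigns to majority logic scoring and pitch tracking.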
