Abstract
This paper describes a computational model that attempts to separate two simultaneous talkers. The goal of the model is to improve a speech recognition system's ability to recognize what each of the two talkers says. The model consists of the following stages: (1) an iterative dynamic programming algorithm to track the pitch period of each of the two talkers, (2) a Markov model to determine the characteristics (e.g., voiced vs. unvoiced) of each speaker's voice, (3) a recursive algorithm that uses both local periodicity information and local spectral continuity constraints to compute a spectral estimate of each talker, (4) a resynthesis algorithm to convert the spectral estimate of each talker into a speech waveform, and (5) a speaker-independent continuous-digit-recognition system that attempts to recognize what each of the two talkers is saying. The system was trained and tested on a database of simultaneous digit strings spoken by a male and a female talker. An evaluation of the different stages of this model is presented.
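As a rough illustration of the kind of pitch tracking stage (1) describes, the sketch below is a minimal single-pass dynamic-programming pitch tracker (not the paper's iterative, two-talker algorithm). Each frame's candidate pitch periods are scored by normalized autocorrelation, and the tracker chooses the path that trades local evidence against frame-to-frame smoothness. All function names, parameters, and the `jump_weight` penalty are illustrative assumptions.

```python
# Hedged sketch of a dynamic-programming pitch tracker; all names and
# parameters are illustrative, not taken from the paper.
import numpy as np

def track_pitch(frames, candidates, jump_weight=0.05):
    """frames: (T, N) array of windowed samples; candidates: lag values."""
    candidates = np.asarray(candidates)
    T, K = len(frames), len(candidates)
    # Local cost: negative normalized autocorrelation at each candidate lag.
    local = np.zeros((T, K))
    for t, x in enumerate(frames):
        energy = np.dot(x, x) + 1e-9
        for k, lag in enumerate(candidates):
            local[t, k] = -np.dot(x[:-lag], x[lag:]) / energy
    # DP recursion: best previous cost plus a pitch-jump smoothness penalty.
    cost = local.copy()
    back = np.zeros((T, K), dtype=int)
    for t in range(1, T):
        for k, lag in enumerate(candidates):
            trans = cost[t - 1] + jump_weight * np.abs(candidates - lag)
            back[t, k] = int(np.argmin(trans))
            cost[t, k] += trans[back[t, k]]
    # Trace back the minimum-cost pitch-period path.
    path = [int(np.argmin(cost[-1]))]
    for t in range(T - 1, 0, -1):
        path.append(back[t, path[-1]])
    return [int(candidates[k]) for k in reversed(path)]

# Usage: a noisy sinusoid with period 80 samples, cut into 12 frames.
rng = np.random.default_rng(0)
sig = np.sin(2 * np.pi * np.arange(3840) / 80) + 0.1 * rng.standard_normal(3840)
frames = sig.reshape(12, 320)
print(track_pitch(frames, list(range(40, 160))))
```

The smoothness penalty is what distinguishes a tracker from independent per-frame pitch picks; the paper's model additionally iterates this idea to recover two interleaved pitch tracks, which this single-track sketch does not attempt.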
