Acoustic Markov models used in the Tangora speech recognition system

Abstract
The Speech Recognition Group at IBM Research has developed a real-time, isolated-word speech recognizer called Tangora, which accepts natural English sentences drawn from a vocabulary of 20,000 words. Despite its large vocabulary, the Tangora recognizer requires only about 20 minutes of speech from each new user for training purposes. The accuracy of the system and its ease of training are largely attributable to the use of hidden Markov models in its acoustic match component. An automatic technique for constructing Markov word models is described, and results are included of experiments with speaker-dependent and speaker-independent models on several isolated-word recognition tasks.

Author(s)
Bahl, L.R. (IBM Thomas J. Watson Res. Center, Yorktown Heights, NY); Brown, P.F.; de Souza, P.V.; Picheny, M.A.
