An improved search algorithm using incremental knowledge for continuous speech recognition

Abstract
A search algorithm that incrementally makes effective use of detailed sources of knowledge is proposed. The algorithm applies all available acoustic and linguistic information in three search phases. Phase one is a left-to-right Viterbi beam search that produces word end times and scores using right-context between-word models with a bigram language model. Phase two, guided by the results of phase one, is a right-to-left Viterbi beam search that produces word begin times and scores based on left-context between-word models. Phase three is an A* search that combines the results of phases one and two with a long-distance language model. The objective is to maximize recognition accuracy with a minimal increase in computational cost. With the decomposed, incremental search algorithm, it is shown that early use of detailed acoustic models can significantly reduce the recognition error rate with a negligible increase in computational cost. It is demonstrated that the early use of detailed knowledge can improve the word error bound by at least 22% for large-vocabulary, speaker-independent, continuous speech recognition.
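
The idea behind phase three can be illustrated with a minimal sketch. It is not the authors' implementation. A small hand-written word lattice stands in for the output of the two Viterbi beam passes, a backward sweep over that lattice supplies an optimistic estimate of the remaining score, and an A* search then rescores complete hypotheses with a long-span language model. The lattice, the scores, and the names `LATTICE`, `backward_best`, `toy_lm`, and `a_star` are all assumptions made for illustration. Scores are log-domain, so larger is better.

```python
import heapq
from collections import defaultdict

# Hypothetical toy lattice: (start_frame, end_frame, word, acoustic_log_score).
# In the abstract's setting these would come from the two Viterbi beam passes.
LATTICE = [
    (0, 2, "the", -1.0),
    (0, 2, "a", -1.2),
    (2, 5, "cat", -2.0),
    (2, 5, "cap", -1.8),
    (5, 8, "sat", -1.5),
    (5, 8, "sap", -1.6),
]
FINAL_FRAME = 8


def backward_best(lattice, final_frame):
    """Right-to-left sweep: best acoustic log-score from each frame to the end."""
    best = defaultdict(lambda: float("-inf"))
    best[final_frame] = 0.0
    # Visit edges by decreasing start frame so best[end] is final when it is read.
    for start, end, _word, score in sorted(lattice, key=lambda e: -e[0]):
        best[start] = max(best[start], score + best[end])
    return best


def toy_lm(history, word):
    """Hypothetical long-span LM: log-probability (<= 0) of word given its history."""
    favored = {("the", "cat"): "sat", ("a", "cap"): "sap"}
    if len(history) >= 2 and favored.get(tuple(history[-2:])) == word:
        return -0.1
    return -1.0


def a_star(lattice, final_frame, lm, start_frame=0):
    """Best-first search over the lattice, scored as acoustic + LM log-scores.

    The heuristic is the acoustic-only backward score. Because LM log-scores
    are never positive, it never underestimates the achievable remainder,
    so the first complete hypothesis popped is the best one.
    """
    h = backward_best(lattice, final_frame)
    outgoing = defaultdict(list)
    for start, end, word, score in lattice:
        outgoing[start].append((end, word, score))

    # heapq is a min-heap, so the priority is the negated estimate g + h.
    heap = [(-h[start_frame], 0.0, start_frame, ())]
    while heap:
        _neg_f, g, frame, words = heapq.heappop(heap)
        if frame == final_frame:
            return list(words), g
        for end, word, score in outgoing[frame]:
            g2 = g + score + lm(words, word)
            heapq.heappush(heap, (-(g2 + h[end]), g2, end, words + (word,)))
    return None, float("-inf")


if __name__ == "__main__":
    print(a_star(LATTICE, FINAL_FRAME, toy_lm))
```

In this toy lattice the best path by acoustic score alone is "the cap sat", while the combined search returns "the cat sat" because the long-span model favors that word sequence. The sketch omits beam pruning and context-dependent between-word models, which the abstract names as part of phases one and two.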
