Efficient search using posterior phone probability estimates

19 November 2002

conference paper
Published by Institute of Electrical and Electronics Engineers (IEEE)

Vol. 1 (ISSN 1520-6149), pp. 596-599
https://doi.org/10.1109/icassp.1995.479668

Abstract

We present a novel, efficient search strategy for large-vocabulary continuous speech recognition (LVCSR). The search algorithm, based on stack decoding, uses posterior phone probability estimates to substantially increase its efficiency with minimal effect on accuracy. In particular, the search space is dramatically reduced by phone deactivation pruning, in which phones with a small local posterior probability are deactivated. This approach is particularly well suited to hybrid connectionist/hidden Markov model systems because the acoustic model computes posterior phone probabilities directly. On large-vocabulary tasks using a trigram language model, phone deactivation pruning increased search speed by an order of magnitude, with 2% or less relative search error. Results from a hybrid system are presented on the Wall Street Journal LVCSR database for a 20,000-word task using a backed-off trigram language model. For this task, our single-pass decoder ran at around 15x real time on an HP735 workstation. At a cost of 7% relative search error, decoding can be sped up to approximately real time.
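The phone deactivation pruning described in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact criterion, so this assumes a single fixed threshold applied to each frame's posteriors and treats a phone as usable over a span only if it is active in every frame of that span. The names `active_phones`, `phone_survives`, the threshold value, and the toy posteriors are all hypothetical, not taken from the paper.

```python
def active_phones(posteriors, threshold):
    """For each frame, return the set of phone indices whose posterior
    probability is at least `threshold` (assumed fixed, illustrative only)."""
    return [
        {phone for phone, prob in enumerate(frame) if prob >= threshold}
        for frame in posteriors
    ]


def phone_survives(phone, start, end, active):
    """Return True if `phone` is active in every frame of [start, end).

    A decoder could skip extending any hypothesis that places `phone`
    over a span where this returns False (an assumed pruning rule).
    """
    return all(phone in active[t] for t in range(start, end))


# Toy posteriors: 3 frames over 4 hypothetical phones; each row sums to 1.
posteriors = [
    [0.90, 0.05, 0.04, 0.01],
    [0.60, 0.30, 0.09, 0.01],
    [0.10, 0.85, 0.04, 0.01],
]

active = active_phones(posteriors, threshold=0.08)

# Fraction of (frame, phone) pairs the search would still have to consider.
kept = sum(len(frame_set) for frame_set in active)
total = sum(len(frame) for frame in posteriors)
print(f"active phone/frame pairs: {kept} of {total}")
```

In this toy case, phone 1 is deactivated in the first frame, so a hypothesis aligning phone 1 to frames 0 through 2 would be pruned, while one aligning it to frames 1 through 2 would be kept. Raising the threshold prunes more of the search space at the risk of more search errors, which is the speed/accuracy trade-off the abstract reports.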
