On the Performance of Isolated Word Speech Recognizers Using Vector Quantization and Temporal Energy Contours

Abstract
In this paper we present the results of a series of experiments in which combinations of vector quantization and temporal energy contours are incorporated into the standard framework of an isolated word recognizer. We consider two distinct word vocabularies, namely a set of 10 digits and a 129-word airlines vocabulary. We show that the incorporation of energy leads to small but consistent improvements in performance for the digits vocabulary. The incorporation of vector quantization (in a judicious manner) leads to a small degradation in performance for both vocabularies, but at the same time significantly reduces the overall computation of the recognizer. We conclude that a high-performance, moderate-computation, isolated word recognizer can be achieved using vector quantization and the temporal energy contour.