Multiple neural network topologies applied to keyword spotting

Abstract
The authors describe several experiments investigating the use of artificial neural networks (ANNs) for speaker-independent keyword recognition in continuous speech. They discuss methods for reducing a primary keyword spotting system's susceptibility to false alarms while maintaining recognition accuracy. The primary keyword spotter uses a conventional dynamic time warping algorithm to detect the start and end points of each potential keyword. The ANNs serve as a secondary processing stage for the segmented utterance, classifying it by treating recognition as a pattern matching problem. In the hybrid network experiments, the utterance was converted into features derived from the hidden-layer activations of a network trained with back-propagation. These hybrid representations were combined with two other feature representations in a multiple neural network system. The system achieved a recognition accuracy of 78% on the Stonehenge X database while rejecting 72% of the false alarms raised by the primary keyword spotting system.
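The abstract does not say which variant of dynamic time warping the primary spotter uses. As an illustration only, the following minimal sketch assumes subsequence DTW with a Euclidean local distance: a keyword template is aligned against a longer utterance, with a free start and end on the utterance axis, to produce a candidate start and end point. The function name `spot_keyword` and the toy one-dimensional features are hypothetical and are not taken from the paper.

```python
import math


def spot_keyword(template, stream):
    """Subsequence DTW sketch (an assumption, not the paper's exact method).

    Aligns every frame of `template` against some contiguous region of
    `stream` and returns (cost, start, end), where stream[start:end] is the
    best-matching segment (end is exclusive).
    """
    n, m = len(template), len(stream)
    inf = math.inf
    # cost[i][j]: best cost aligning template[:i] to a segment ending at stream[j-1]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    # start[i][j]: index of the first stream frame on that best path
    start = [[0] * (m + 1) for _ in range(n + 1)]
    for j in range(m + 1):
        cost[0][j] = 0.0  # free start: the match may begin at any stream frame
        start[0][j] = j

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(template[i - 1], stream[j - 1])
            # Diagonal is listed first so ties prefer it; min() keeps the
            # first minimal element, which keeps start indices consistent.
            candidates = [
                (cost[i - 1][j - 1], start[i - 1][j - 1]),  # diagonal
                (cost[i - 1][j], start[i - 1][j]),          # vertical
                (cost[i][j - 1], start[i][j - 1]),          # horizontal
            ]
            best_cost, best_start = min(candidates, key=lambda c: c[0])
            cost[i][j] = d + best_cost
            start[i][j] = best_start

    # Free end: pick the stream position where the full template matches best.
    end = min(range(1, m + 1), key=lambda j: cost[n][j])
    return cost[n][end], start[n][end], end


if __name__ == "__main__":
    # Toy 1-D "features": the template appears at stream frames 2..4.
    template = [[0.0], [1.0], [2.0]]
    stream = [[5.0], [5.0], [0.0], [1.0], [2.0], [5.0]]
    print(spot_keyword(template, stream))
```

In a two-stage arrangement like the one described, a segment found this way would be handed to the secondary ANN stage, which would accept it as a keyword or reject it as a false alarm.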
