Abstract
The authors present single- and multispeaker recognition results for the voiced stop consonants /b, d, g/ using time-delay neural networks (TDNNs), a new objective function for training these networks, and a simple arbitration scheme for improved classification accuracy. These enhancements yield a median 24% reduction in the number of misclassifications compared with TDNNs trained with the traditional backpropagation objective function. This reduction results in /b, d, g/ recognition rates that consistently exceed 98% for TDNNs trained with individual speakers, and in a 98.1% recognition rate for a TDNN trained with three male speakers.
