Comparative Study of Nonlinear Time Warping Techniques in Isolated Word Speech Recognition Systems
- 17 June 1981
- report
- Published by Defense Technical Information Center (DTIC)
Abstract
In this paper we describe an isolated word recognition system and discuss the design choices that affect its performance. In particular, we report experimental results evaluating several methods proposed for optimizing the performance of dynamic warping algorithms. Three aspects were investigated: relaxation of the boundary conditions to allow for inaccurate begin-end time detection; choice of warping algorithm (e.g., Itakura asymmetric, Sakoe and Chiba symmetric, Sakoe and Chiba asymmetric); and choice of an appropriate warping window to restrict computation to the minimum needed for best recognition results. Recognition was tested on two vocabularies: the digits and a highly confusable subset of the alphabet (e.g., e, b, d, p, t, g, v, c, z). Relaxing the boundary conditions degraded performance on both the confusable subset and the digits. The asymmetric Itakura algorithm yielded better results for the confusables, while the symmetric Sakoe and Chiba algorithm gave slightly better results for the digits. A 100-ms warping window appears to be optimal for both vocabularies.
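To make the role of the warping window concrete, the following is a minimal sketch of symmetric dynamic warping restricted to a band around the diagonal. It is an illustration under stated assumptions, not the system evaluated in the report: the function names `dtw_distance` and `recognize`, the use of scalar frames with an absolute-difference local distance, the 1-2-1 step weights with normalization by the summed lengths, and the strict (unrelaxed) endpoint constraints are all choices made here for brevity. The window is given in frames; converting a 100-ms window to frames depends on the frame shift, which the abstract does not state.

```python
import math


def dtw_distance(ref, test, window, dist=None):
    """Symmetric DTW between two frame sequences inside a diagonal band.

    `window` is the band half-width in frames (hypothetical parameter name).
    Frames are scalars here; a real recognizer would compare feature vectors.
    """
    if dist is None:
        dist = lambda a, b: abs(a - b)  # assumed local distance
    n, m = len(ref), len(test)
    # The band must span the length difference, or (n, m) is unreachable.
    w = max(window, abs(n - m))
    inf = math.inf
    # D[i][j] = best accumulated cost aligning ref[:i] with test[:j].
    D = [[inf] * (m + 1) for _ in range(n + 1)]
    D[0][0] = 0.0  # strict boundary condition: paths start at the first frames
    for i in range(1, n + 1):
        for j in range(max(1, i - w), min(m, i + w) + 1):
            d = dist(ref[i - 1], test[j - 1])
            D[i][j] = min(
                D[i - 1][j] + d,          # vertical step, weight 1
                D[i - 1][j - 1] + 2 * d,  # diagonal step, weight 2
                D[i][j - 1] + d,          # horizontal step, weight 1
            )
    # Path weights always sum to n + m, so this normalizes for length.
    return D[n][m] / (n + m)


def recognize(test, templates, window):
    """Return the label of the reference template nearest to `test`."""
    return min(templates, key=lambda label: dtw_distance(templates[label], test, window))
```

With a window of zero frames the comparison collapses to a frame-by-frame match along the diagonal, so a small timing shift is penalized; widening the window lets the path absorb the shift at the cost of more cells to compute. For example, `dtw_distance([0, 0, 1, 1], [0, 1, 1, 1], 0)` gives 0.25, while the same call with a window of 1 gives 0.0.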