Automatic word duration rules for demisyllable‐based isolated word recognition
- 1 November 1981
- journal article
- abstracts
- Published by Acoustical Society of America (ASA) in The Journal of the Acoustical Society of America
- Vol. 70 (S1) , S60
- https://doi.org/10.1121/1.2018957
Abstract
It has previously been demonstrated that reliable, speaker‐trained, isolated word recognition on a 1109‐word Basic English vocabulary can be performed using word templates formed by concatenation of elements from a corpus of demisyllables. Since a dynamic time warping (DTW) algorithm is used to align test and reference patterns, small to moderate differences in duration between test and reference words present no major problem in performing the time alignment. However, improved results (i.e., smaller word distances) are obtained from the DTW algorithm if the syllables of the test and reference words are properly aligned prior to dynamic time warping. In our earlier experiments, each concatenated reference word was linearly prenormalized to the duration of that word in the test set, but this procedure is clearly not applicable for continuous speech recognition. We have now developed a linguistically based set of duration rules that predict syllable duration as a function of syllable stress level and the position of the syllable within the word. The rules are applied to the demisyllables during the word‐creation process (i.e., before DTW). Using the automatic duration rules, we have achieved recognition accuracies comparable to those based on known word durations.
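The pipeline described in the abstract (predict a duration for each syllable from its stress level and position, stretch the concatenated units to that duration, then compare with DTW) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' rule set: the tables `BASE_FRAMES` and `POSITION_SCALE`, their values, the position categories, and all function names are hypothetical, and features are reduced to one number per frame so the example stays short. For simplicity, each syllable is treated as a single feature sequence already assembled from its demisyllables.

```python
# Hypothetical rule tables: the abstract gives no numeric values.
BASE_FRAMES = {"primary": 30, "secondary": 24, "unstressed": 16}
POSITION_SCALE = {"only": 1.3, "initial": 1.0, "medial": 0.85, "final": 1.2}


def syllable_position(index, count):
    """Classify a syllable by its position within the word."""
    if count == 1:
        return "only"
    if index == 0:
        return "initial"
    if index == count - 1:
        return "final"
    return "medial"


def predict_frames(stress, index, count):
    """Predict syllable duration (in frames) from stress level and position."""
    scale = POSITION_SCALE[syllable_position(index, count)]
    return max(2, round(BASE_FRAMES[stress] * scale))


def resample(frames, target_len):
    """Linearly stretch or compress a 1-D feature sequence to target_len."""
    if target_len == 1 or len(frames) == 1:
        return [frames[0]] * target_len
    step = (len(frames) - 1) / (target_len - 1)
    out = []
    for i in range(target_len):
        pos = i * step
        lo = int(pos)
        hi = min(lo + 1, len(frames) - 1)
        frac = pos - lo
        out.append(frames[lo] * (1 - frac) + frames[hi] * frac)
    return out


def build_template(syllables):
    """Concatenate (stress, features) syllables after rule-based rescaling."""
    template = []
    count = len(syllables)
    for index, (stress, feats) in enumerate(syllables):
        template.extend(resample(feats, predict_frames(stress, index, count)))
    return template


def dtw_distance(a, b):
    """Plain DTW with absolute-difference local cost and no path constraints."""
    inf = float("inf")
    prev = [0.0] + [inf] * len(b)
    for x in a:
        cur = [inf] * (len(b) + 1)
        for j, y in enumerate(b, start=1):
            cur[j] = abs(x - y) + min(prev[j], prev[j - 1], cur[j - 1])
        prev = cur
    return prev[len(b)]


# Two-syllable word: stressed first syllable, unstressed final syllable.
word = [("primary", [0.0, 1.0, 0.5]), ("unstressed", [0.5, 0.2])]
reference = build_template(word)
```

The reference template is sized from the rules alone, so no knowledge of the test word's duration is needed before `dtw_distance` is called. That is the property the abstract identifies as necessary for moving beyond isolated words.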