A bootstrapping training technique for obtaining demisyllable reference patterns
- 1 June 1982
- journal article
- research article
- Published by Acoustical Society of America (ASA) in The Journal of the Acoustical Society of America
- Vol. 71 (6), 1588-1595
- https://doi.org/10.1121/1.387813
Abstract
The process of obtaining reference patterns for syllablelike units is tedious, error prone, and time consuming. As such, speech recognition systems based on such units are usually tested on only a single talker. In this paper we describe a procedure for taking demisyllable reference patterns, excised from spoken utterances of one talker, and automatically creating demisyllable reference patterns for a new talker. The procedure is based on dynamic time warping alignment of the spoken utterances (each containing the relevant demisyllable), and on the assumption that the optimum warping path identifies the best matching demisyllable within the utterance. The automatic procedure has been used to create demisyllable reference patterns for two new talkers, each talking over a dialed-up telephone line (the original recordings were made over a high-quality microphone). Reference patterns were made for 100 isolated words from a given lexical specification of each word in terms of demisyllables in the inventory. Word recognition accuracies greater than 90% were obtained for both talkers on the 100-word vocabulary. In a second experiment, using a 1109-word vocabulary, recognition accuracies from reference patterns based on automatically extracted demisyllables were 2%–5% worse than accuracies from reference patterns based on hand corrections applied to the demisyllables. These results show that the automatic demisyllable extraction technique provides a very good first-pass set of demisyllables and that, combined with some manual corrections, it provides a set of demisyllable patterns suitable for use in a recognition system.
This publication has 1 reference indexed in Scilit:
- Isolated and Connected Word Recognition--Theory and Selected Applications, IEEE Transactions on Communications, 1981
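The abstract's central idea, that the optimum warping path picks out the best matching demisyllable inside a longer utterance, can be illustrated with a small sketch. This is not the paper's procedure: it uses a generic subsequence form of dynamic time warping with free start and end points, a Euclidean frame distance, and toy one-dimensional "feature" frames, all of which are assumptions made for illustration. The names `frame_distance` and `locate_segment` are hypothetical.

```python
import math


def frame_distance(a, b):
    """Euclidean distance between two feature frames (an assumed local distance)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def locate_segment(reference, utterance):
    """Find the stretch of `utterance` that best matches `reference`.

    Subsequence DTW: the path may start and end at any utterance frame.
    Returns (start, end, cost), with `end` inclusive.
    """
    n, m = len(reference), len(utterance)
    inf = float("inf")
    cost = [[inf] * m for _ in range(n)]
    start = [[0] * m for _ in range(n)]

    # Free start: the first reference frame may align to any utterance frame.
    for j in range(m):
        cost[0][j] = frame_distance(reference[0], utterance[j])
        start[0][j] = j

    for i in range(1, n):
        for j in range(m):
            # Allowed predecessors: vertical, diagonal, horizontal steps.
            candidates = [(cost[i - 1][j], start[i - 1][j])]
            if j > 0:
                candidates.append((cost[i - 1][j - 1], start[i - 1][j - 1]))
                candidates.append((cost[i][j - 1], start[i][j - 1]))
            best_cost, best_start = min(candidates)
            cost[i][j] = best_cost + frame_distance(reference[i], utterance[j])
            start[i][j] = best_start

    # Free end: pick the utterance frame where the full reference ends most cheaply.
    end = min(range(m), key=lambda j: cost[n - 1][j])
    return start[n - 1][end], end, cost[n - 1][end]


# Toy data: a 3-frame reference pattern embedded in a 6-frame utterance.
reference = [[1.0], [2.0], [3.0]]
utterance = [[0.0], [0.0], [1.0], [2.0], [3.0], [0.0]]
seg_start, seg_end, seg_cost = locate_segment(reference, utterance)
new_pattern = utterance[seg_start:seg_end + 1]
```

In this sketch, the frames between `seg_start` and `seg_end` of the new talker's utterance would be excised as that talker's version of the pattern. A real system would use spectral feature vectors per frame and, as the abstract notes, may still need manual correction of the extracted boundaries.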