Automatic modeling for adding new words to a large-vocabulary continuous speech recognition system

1 January 1991

conference paper
Published by Institute of Electrical and Electronics Engineers (IEEE)

No. 15206149, pp. 305-308, vol. 1
https://doi.org/10.1109/icassp.1991.150337

Abstract

The authors report on the detection of new words for the speaker-dependent and speaker-independent paradigms. A useful operating point in a speaker-dependent paradigm is defined at 71% detection rate and 1% false alarm rate. The authors present a novel technique for obtaining a phonetic transcription for a new word, which is needed to add the new word to the system. The technique utilizes DECtalk's text-to-sound rules to obtain an initial phonetic transcription for the new word. Since these text-to-sound rules are imperfect, a probabilistic transformation technique is used that produces a phonetic pronunciation network of all possible pronunciations given DECtalk's transcription. The network is used to constrain a phonetic recognition process that results in an improved phonetic transcription for the new word. The resulting transcriptions are sufficient for speech recognition purposes.
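
The pipeline described in the abstract (rule-based initial transcription, probabilistic expansion into a pronunciation network, then recognition constrained by that network) can be illustrated with a minimal sketch. It is not the authors' implementation. The confusion table, its probabilities, the phone symbols, and the `acoustic_score` callback are all hypothetical stand-ins, and the sketch assumes phone substitutions only.

```python
from itertools import product

# HYPOTHETICAL substitution table: for each phone predicted by the
# text-to-sound rules, the phones that might actually be spoken, with
# made-up probabilities P(spoken | predicted). Not taken from the paper.
CONFUSIONS = {
    "ax": {"ax": 0.7, "ih": 0.3},
    "t": {"t": 0.9, "d": 0.1},
}


def pronunciation_network(initial):
    """Build one {phone: probability} slot per phone of the initial transcription.

    Phones missing from the table are kept unchanged with probability 1.
    """
    return [CONFUSIONS.get(phone, {phone: 1.0}) for phone in initial]


def enumerate_paths(network):
    """Return every path through the network as (phones, prior probability)."""
    paths = []
    for combo in product(*(slot.items() for slot in network)):
        phones = tuple(phone for phone, _ in combo)
        prior = 1.0
        for _, prob in combo:
            prior *= prob
        paths.append((phones, prior))
    return paths


def best_transcription(network, acoustic_score):
    """Pick the path that maximizes prior * acoustic score.

    `acoustic_score` is an assumed callback standing in for a phonetic
    recognizer scoring a phone sequence against the spoken new word.
    """
    best = max(enumerate_paths(network), key=lambda p: p[1] * acoustic_score(p[0]))
    return best[0]


if __name__ == "__main__":
    # Made-up initial transcription from the text-to-sound rules.
    network = pronunciation_network(["d", "ax", "t"])
    # Toy acoustic evidence that favors hearing "ih" in the utterance.
    print(best_transcription(network, lambda phones: 0.9 if "ih" in phones else 0.1))
```

Exhaustive enumeration is used here only because the toy network is tiny. A practical system would search the network with dynamic programming rather than listing every path.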
