Lexical access to large vocabularies for speech recognition

Abstract
A large-vocabulary isolated word recognition system based on the hypothesize-and-test paradigm is described. The system has, however, been devised as a word hypothesizer for a continuous speech understanding system able to answer queries put to a geographical database. Word preselection is achieved by segmenting the input signal and classifying the segments in terms of broad phonetic classes. Because this phonetic code has low redundancy for lexical access, a lattice of phonetic segments, rather than a single sequence of hypotheses, is generated to achieve high performance. The lattice can be organized as a graph, and word hypotheses are obtained by matching this graph against the models of all vocabulary words. A word model is itself a phonetic representation in the form of a graph that accounts for deletion, substitution, and insertion errors. A modified Dynamic Programming (DP) matching procedure gives an efficient solution to this graph-to-graph matching problem. Hidden Markov Models (HMMs) of subword units are used as more detailed knowledge in the verification step. The word candidates generated by the preselection step are represented as sequences of diphone-like subword units, and the Viterbi algorithm is used to evaluate their likelihood. To reduce storage and computational costs, lexical knowledge is organized in a tree structure in which the initial common subsequences of word descriptions are shared, and a beam-search strategy carries forward only the most promising paths. The results show that the two-pass approach achieves a complexity reduction of about 73 percent with respect to the direct approach, while recognition accuracy remains comparable.
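As a rough illustration of the lexical organization described in the abstract, the following minimal Python sketch builds a prefix tree over phonetic word descriptions, so that common initial subsequences are shared, and prunes hypotheses with a simple beam-search threshold. This is not the paper's implementation: the toy lexicon, the scoring function, and the beam width are hypothetical assumptions, the input is a linear sequence of symbols rather than a segment lattice, and the HMM/Viterbi verification pass is not shown.

```python
class TrieNode:
    """Node of a tree-structured lexicon: common word prefixes are shared."""
    def __init__(self):
        self.children = {}   # phonetic unit -> TrieNode
        self.word = None     # word ending at this node, if any


def build_lexicon_tree(lexicon):
    """lexicon: dict mapping word -> sequence of phonetic units (assumed format)."""
    root = TrieNode()
    for word, units in lexicon.items():
        node = root
        for unit in units:
            node = node.children.setdefault(unit, TrieNode())
        node.word = word
    return root


def beam_search(root, observations, unit_score, beam_width=3):
    """Advance hypotheses through the tree, keeping only the best-scoring
    paths at each step (beam pruning). unit_score(obs, unit) is a
    hypothetical scorer returning a log-likelihood-like value for matching
    one observation against one phonetic unit."""
    hyps = [(0.0, root)]                      # (cumulative score, node)
    completed = []
    for obs in observations:
        expanded = []
        for score, node in hyps:
            for unit, child in node.children.items():
                expanded.append((score + unit_score(obs, unit), child))
        expanded.sort(key=lambda h: h[0], reverse=True)
        hyps = expanded[:beam_width]          # carry forward most promising paths only
        completed.extend((s, n.word) for s, n in hyps if n.word is not None)
    return sorted(completed, key=lambda h: h[0], reverse=True)


# Toy usage with characters standing in for phonetic units.
lexicon = {"torino": list("torino"), "torre": list("torre"), "toscana": list("toscana")}
tree = build_lexicon_tree(lexicon)
score = lambda obs, unit: 0.0 if obs == unit else -1.0
print(beam_search(tree, list("torino"), score))
```

The sketch only shows why prefix sharing and beam pruning reduce storage and computation: hypotheses for words with a common initial subsequence are expanded once, and low-scoring paths are discarded before they branch further.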