Lexical access to large vocabularies for speech recognition
- 1 January 1989
- journal article
- Published by Institute of Electrical and Electronics Engineers (IEEE) in IEEE Transactions on Acoustics, Speech, and Signal Processing
- Vol. 37 (8), 1197-1213
- https://doi.org/10.1109/29.31268
Abstract
A large-vocabulary isolated-word recognition system based on the hypothesize-and-test paradigm is described. The system was, however, devised as a word hypothesizer for a continuous speech understanding system that answers queries put to a geographical database. Word preselection is achieved by segmenting and classifying the input signal in terms of broad phonetic classes. Because this phonetic code has low redundancy for lexical access, a lattice of phonetic segments, rather than a single sequence of hypotheses, is generated to achieve high performance. The lattice can be organized as a graph, and word hypothesization is obtained by matching this graph against the models of all vocabulary words. A word model is itself a phonetic representation expressed as a graph that accounts for deletion, substitution, and insertion errors. A modified Dynamic Programming (DP) matching procedure gives an efficient solution to this graph-to-graph matching problem. Hidden Markov Models (HMMs) of subword units provide more detailed knowledge in the verification step. The word candidates generated by the previous step are represented as sequences of diphone-like subword units, and the Viterbi algorithm is used to evaluate their likelihood. To reduce storage and computational costs, lexical knowledge is organized in a tree structure in which the initial common subsequences of word descriptions are shared, and a beam-search strategy carries forward only the most promising paths. The results show that the two-pass approach reduces complexity by about 73 percent with respect to the direct approach, while the recognition accuracy remains comparable.
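The tree-organized lexicon and error-tolerant DP matching mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' procedure: it matches a single observed sequence rather than a lattice, uses unit costs for deletion, substitution, and insertion, and prunes with a fixed cost threshold standing in for a beam. The class labels (`L`, `V`, `N`, `P`, `F`), the toy lexicon, and the function names are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Node:
    """One node of the lexical tree; words sharing a prefix share nodes."""
    children: Dict[str, "Node"] = field(default_factory=dict)
    word: Optional[str] = None  # set on the node that ends a word


def build_tree(lexicon: Dict[str, List[str]]) -> Node:
    root = Node()
    for word, units in lexicon.items():
        node = root
        for unit in units:
            node = node.children.setdefault(unit, Node())
        node.word = word
    return root


def count_nodes(node: Node) -> int:
    """Number of stored units (tree nodes below the given node)."""
    return sum(1 + count_nodes(child) for child in node.children.values())


def preselect(root: Node, observed: List[str], max_cost: int) -> Dict[str, int]:
    """Return {word: edit cost} for every word within max_cost of `observed`.

    One DP row is computed per tree node, so a shared prefix is scored once.
    """
    results: Dict[str, int] = {}

    def visit(node: Node, unit: str, prev_row: List[int]) -> None:
        row = [prev_row[0] + 1]  # model unit with no observed segment
        for j, obs in enumerate(observed, start=1):
            row.append(min(
                row[j - 1] + 1,                    # insertion: extra observed segment
                prev_row[j] + 1,                   # deletion: model unit not observed
                prev_row[j - 1] + (unit != obs),   # match or substitution
            ))
        if node.word is not None and row[-1] <= max_cost:
            results[node.word] = row[-1]
        if min(row) <= max_cost:  # otherwise prune the whole subtree
            for child_unit, child in node.children.items():
                visit(child, child_unit, row)

    first_row = list(range(len(observed) + 1))
    for child_unit, child in root.children.items():
        visit(child, child_unit, first_row)
    return results


# Hypothetical broad-class transcriptions (L liquid, V vowel, N nasal,
# P plosive, F fricative); not taken from the paper.
lexicon = {
    "roma": ["L", "V", "N", "V"],
    "rimini": ["L", "V", "N", "V", "N", "V"],
    "pisa": ["P", "V", "F", "V"],
    "pavia": ["P", "V", "F", "V", "V"],
}
root = build_tree(lexicon)
candidates = preselect(root, ["L", "V", "V"], max_cost=1)  # nasal segment missed
```

In this toy lexicon the tree stores 11 units where a flat list would store 19, and the query with a missed nasal still returns `roma` at cost 1 while the subtree shared by `pisa` and `pavia` is abandoned after three nodes. A real preselection step would use class-dependent error costs and a score-relative beam rather than a fixed threshold.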