Properties of large lexicons: Implications for advanced isolated word recognition systems
- 24 March 2005
- conference paper
- Published by Institute of Electrical and Electronics Engineers (IEEE)
- Vol. 7, 546-549
- https://doi.org/10.1109/icassp.1982.1171902
Abstract
As part of our goal to design large-vocabulary, phonetically-based isolated word recognition systems, we investigated the statistical properties and constraints of the phonemic structures of English words. Our database consisted of five lexicons varying in size from 1,250 to 20,000 words. The lexicons included, in addition to a phonemic transcription for each word, the word's frequency of occurrence as determined from the Brown Corpus. We studied the distributions of the phonemes, both individually and by class, within the lexicon and within the corpus. Distributions of consonant clusters were also obtained. Finally, the distribution of words in terms of patterns derived from broad categorization of the phonemes was investigated. This paper summarizes the results of these studies and discusses implications for phonetically-based isolated word recognition strategies.
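The kind of analysis the abstract describes can be illustrated with a minimal sketch. It is not the authors' code: the broad-class inventory (`BROAD_CLASS`), the four-word toy lexicon, and the frequency values are all invented for illustration, and the function names are hypothetical. The sketch maps each phonemic transcription to a broad-class pattern, groups words that share a pattern, and counts phonemes either per lexicon entry or weighted by word frequency.

```python
from collections import Counter, defaultdict

# Hypothetical broad-class map over a small phoneme subset (assumption,
# not the class inventory used in the paper).
BROAD_CLASS = {
    "p": "STOP", "t": "STOP", "k": "STOP", "b": "STOP", "d": "STOP", "g": "STOP",
    "s": "FRIC", "z": "FRIC", "f": "FRIC", "v": "FRIC",
    "m": "NASAL", "n": "NASAL",
    "l": "LIQ", "r": "LIQ",
    "ae": "VOWEL", "ih": "VOWEL", "iy": "VOWEL", "aa": "VOWEL",
}

def broad_pattern(phonemes):
    """Replace each phoneme with its broad class, keeping the order."""
    return tuple(BROAD_CLASS[p] for p in phonemes)

def group_by_pattern(lexicon):
    """Group words that share the same broad-class pattern."""
    groups = defaultdict(list)
    for word, phonemes in lexicon.items():
        groups[broad_pattern(phonemes)].append(word)
    return dict(groups)

def phoneme_counts(lexicon, freq=None):
    """Count phonemes per lexicon entry, or weighted by word frequency
    when a frequency table is supplied."""
    counts = Counter()
    for word, phonemes in lexicon.items():
        weight = 1 if freq is None else freq.get(word, 0)
        for p in phonemes:
            counts[p] += weight
    return counts

# Toy data, invented for illustration only.
TOY_LEXICON = {
    "pat": ["p", "ae", "t"],
    "bad": ["b", "ae", "d"],
    "sit": ["s", "ih", "t"],
    "man": ["m", "ae", "n"],
}
TOY_FREQ = {"pat": 3, "bad": 10, "sit": 5, "man": 20}

if __name__ == "__main__":
    for pattern, words in group_by_pattern(TOY_LEXICON).items():
        print(pattern, words)
    print(phoneme_counts(TOY_LEXICON))
    print(phoneme_counts(TOY_LEXICON, TOY_FREQ))
```

In this toy setting, the number of words sharing a pattern indicates how far a broad-class transcription alone narrows the candidate set, and the difference between the unweighted and frequency-weighted counts mirrors the abstract's distinction between distributions within the lexicon and within the corpus.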