Two decades of statistical language modeling: where do we go from here?
- 1 August 2000
- journal article
- Published by Institute of Electrical and Electronics Engineers (IEEE) in Proceedings of the IEEE
- Vol. 88 (8) , 1270-1278
- https://doi.org/10.1109/5.880083
Abstract
Statistical language models estimate the distribution of various natural language phenomena for the purpose of speech recognition and other language technologies. Since the first significant model was proposed in 1980, many attempts have been made to improve the state of the art. We review them, point to a few promising directions, and argue for a Bayesian approach to integration of linguistic theories with data.
This publication has 49 references indexed in Scilit:
- Variable n-grams and extensions for conversational speech language modeling. IEEE Transactions on Speech and Audio Processing, 2000
- Large vocabulary speech recognition with multispan statistical language models. IEEE Transactions on Speech and Audio Processing, 2000
- A survey of smoothing techniques for ME models. IEEE Transactions on Speech and Audio Processing, 2000
- Modeling long distance dependence in language: topic mixtures versus dynamic cache models. IEEE Transactions on Speech and Audio Processing, 1999
- A multispan language modeling framework for large vocabulary speech recognition. IEEE Transactions on Speech and Audio Processing, 1998
- Indexing by latent semantic analysis. Journal of the American Society for Information Science, 1990
- A tree-based statistical language model for natural language speech recognition. IEEE Transactions on Acoustics, Speech, and Signal Processing, 1989
- Perplexity—a measure of the difficulty of speech recognition tasks. The Journal of the Acoustical Society of America, 1977
- Generalized Iterative Scaling for Log-Linear Models. The Annals of Mathematical Statistics, 1972
- Information Theory and Statistical Mechanics. Physical Review, 1957