Modeling long distance dependence in language: topic mixtures vs. dynamic cache models
- 24 December 2002
- proceedings article
- Published by Institute of Electrical and Electronics Engineers (IEEE)
- Vol. 1, 236-239
- https://doi.org/10.1109/icslp.1996.607085
Abstract
In this paper, we investigate a new statistical language model which captures topic-related dependencies of words within and across sentences. First, we develop a sentence-level mixture language model that takes advantage of the topic constraints in a sentence or article. Second, we introduce topic-dependent dynamic cache adaptation techniques in the framework of the mixture model. Experiments with the static (or unadapted) mixture model on the 1994 WSJ task indicated a 21% reduction in perplexity and a 3-4% improvement in recognition accuracy over a general n-gram model. The static mixture model also improved recognition performance over an adapted n-gram model. Mixture adaptation techniques contributed a further 14% reduction in perplexity and a small improvement in recognition accuracy.
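The sentence-level mixture idea in the abstract can be illustrated with a minimal sketch. It assumes the usual formulation in which a sentence's probability is a weighted sum, over topics, of the probability that each topic-specific n-gram model assigns to the whole sentence. The abstract does not give the formula, so the class `BigramLM`, the add-one smoothing, the toy corpora, and the equal mixture weights below are all illustrative assumptions, not the authors' implementation. Dynamic cache adaptation is not shown.

```python
import math
from collections import Counter


class BigramLM:
    """Add-one-smoothed bigram model; a hypothetical stand-in for one
    topic-specific n-gram component of the mixture."""

    def __init__(self, sentences, vocab):
        self.vocab = vocab
        self.bigrams = Counter()   # counts of (previous word, word)
        self.contexts = Counter()  # counts of previous word
        for sentence in sentences:
            tokens = ["<s>"] + sentence
            for prev, word in zip(tokens, tokens[1:]):
                self.bigrams[(prev, word)] += 1
                self.contexts[prev] += 1

    def prob(self, prev, word):
        # Add-one smoothing over the shared vocabulary (an assumption).
        return (self.bigrams[(prev, word)] + 1) / (
            self.contexts[prev] + len(self.vocab)
        )


def sentence_log_prob(model, sentence):
    """Log-probability of a whole sentence under a single component."""
    tokens = ["<s>"] + sentence
    return sum(math.log(model.prob(p, w)) for p, w in zip(tokens, tokens[1:]))


def mixture_sentence_prob(models, weights, sentence):
    """Sentence-level mixture: sum_k weight_k * prod_i P_k(w_i | w_{i-1}).

    The mixing happens once per sentence, not once per word, so a single
    topic has to explain the whole sentence."""
    return sum(
        lam * math.exp(sentence_log_prob(m, sentence))
        for m, lam in zip(models, weights)
    )


# Toy data: two "topics" sharing one vocabulary (illustrative only).
finance = [["stocks", "rose"], ["stocks", "fell"]]
sports = [["team", "won"], ["team", "lost"]]
vocab = {"stocks", "rose", "fell", "team", "won", "lost"}

finance_lm = BigramLM(finance, vocab)
sports_lm = BigramLM(sports, vocab)

test_sentence = ["stocks", "rose"]
p_finance = math.exp(sentence_log_prob(finance_lm, test_sentence))
p_sports = math.exp(sentence_log_prob(sports_lm, test_sentence))
p_mixture = mixture_sentence_prob(
    [finance_lm, sports_lm], [0.5, 0.5], test_sentence
)
```

In this sketch the finance component assigns the test sentence a higher probability than the sports component, and the mixture value lies between the two. In practice the weights would be estimated from data rather than fixed by hand.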
This publication has 8 references indexed in Scilit:
- A hybrid approach to adaptive statistical language modeling. Published by Association for Computational Linguistics (ACL), 1994
- Language modeling with sentence-level mixtures. Published by Association for Computational Linguistics (ACL), 1994
- On the dynamic adaptation of stochastic language models. Published by Institute of Electrical and Electronics Engineers (IEEE), 1993
- Trigger-based language models: a maximum entropy approach. Published by Institute of Electrical and Electronics Engineers (IEEE), 1993
- Statistical language modeling combining N-gram and context-free grammars. Published by Institute of Electrical and Electronics Engineers (IEEE), 1993
- The estimation of powerful language models from small and large corpora. Published by Institute of Electrical and Electronics Engineers (IEEE), 1993
- A dynamic language model for speech recognition. Published by Association for Computational Linguistics (ACL), 1991
- A cache-based natural language model for speech recognition. Published by Institute of Electrical and Electronics Engineers (IEEE), 1990