Topic adaptation for language modeling using unnormalized exponential models
- 27 November 2002
- proceedings article
- Published by Institute of Electrical and Electronics Engineers (IEEE)
- Vol. 2, 681-684
- https://doi.org/10.1109/icassp.1998.675356
Abstract
In this paper, we present novel techniques for performing topic adaptation on an n-gram language model. Given training text labeled with topic information, we automatically identify the most relevant topics for new text. We adapt our language model toward these topics using an exponential model, by adjusting probabilities in our model to agree with those found in the topical subset of the training data. For efficiency, we do not normalize the model; that is, we do not require that the "probabilities" in the language model sum to 1. With these techniques, we were able to achieve a modest reduction in speech recognition word-error rate in the Broadcast News domain.
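The idea of an unnormalized exponential adjustment can be illustrated with a minimal sketch. It is not the paper's implementation: the unigram log-ratio features, the add-one smoothing, the damping parameter `beta`, and all function names are assumptions chosen for illustration. Each word's base probability is multiplied by an exponential factor that pulls it toward the topical subset, and the result is deliberately left unnormalized.

```python
import math
from collections import Counter


def unigram_probs(tokens, vocab, smoothing=1.0):
    """Add-one smoothed unigram estimates over a shared vocabulary (assumed scheme)."""
    counts = Counter(tokens)
    total = len(tokens) + smoothing * len(vocab)
    return {w: (counts[w] + smoothing) / total for w in vocab}


def adaptation_weights(general_tokens, topic_tokens, beta=0.5):
    """Hypothetical per-word exponents: beta * log(P_topic(w) / P_general(w))."""
    vocab = set(general_tokens) | set(topic_tokens)
    p_general = unigram_probs(general_tokens, vocab)
    p_topic = unigram_probs(topic_tokens, vocab)
    return {w: beta * math.log(p_topic[w] / p_general[w]) for w in vocab}


def adapted_score(base_prob, word, weights):
    """Scale a base n-gram probability by exp(weight).

    No normalizing constant is computed, so the scores for a given
    history need not sum to 1.
    """
    return base_prob * math.exp(weights.get(word, 0.0))


general = "the market fell the team won the game".split()
topic = "the team won the game".split()
weights = adaptation_weights(general, topic)

boosted = adapted_score(0.01, "team", weights)     # relatively more frequent in topic text
dampened = adapted_score(0.01, "market", weights)  # absent from topic text
unchanged = adapted_score(0.01, "zebra", weights)  # unseen word, weight defaults to 0
```

Skipping normalization avoids summing over the whole vocabulary for every history, which is the efficiency motivation the abstract mentions; the cost is that the resulting scores are no longer true probabilities.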