A novel dependency language model for information retrieval
- 1 May 2007
- journal article
- Published by Zhejiang University Press in Journal of Zhejiang University-SCIENCE A
- Vol. 8 (6), 871-882
- https://doi.org/10.1631/jzus.2007.a0871
Abstract
This paper explores the application of term dependency in information retrieval (IR) and proposes a novel dependency retrieval model. The model extends the existing language modeling (LM) approach to IR by introducing dependency models for both the query and the document. Relevance between a document and a query is then evaluated by the Kullback-Leibler divergence between their dependency models. The paper introduces a novel hybrid dependency structure, which allows various forms of dependency to be integrated within a single framework. A method based on pseudo relevance feedback is also introduced for constructing the query dependency model. The basic idea is to use query-relevant, top-ranking sentences extracted from the top documents at retrieval time as an augmented representation of the query, from which the relationships between query terms are identified. A Markov Random Field (MRF) based approach is presented to ensure the relevance of the extracted sentences; it uses the association features between query terms within a sentence to evaluate the relevance of each sentence. The dependency retrieval model was compared with other traditional retrieval models. Experiments indicated that it produces significant improvements in retrieval effectiveness.
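The Kullback-Leibler ranking step mentioned in the abstract can be illustrated with a minimal sketch. It is not the paper's dependency model: it assumes plain unigram language models, and Dirichlet smoothing with a hypothetical parameter `mu`, purely to show how a document is scored by the divergence of its model from the query model. All function names here are illustrative.

```python
import math
from collections import Counter


def unigram_model(tokens):
    """Maximum-likelihood unigram distribution over the given tokens."""
    counts = Counter(tokens)
    total = sum(counts.values())
    return {term: count / total for term, count in counts.items()}


def smoothed_doc_model(doc_tokens, collection_model, mu=10.0):
    """Dirichlet-smoothed document model (the smoothing choice is an assumption)."""
    counts = Counter(doc_tokens)
    length = len(doc_tokens)
    return {
        term: (counts.get(term, 0) + mu * p_coll) / (length + mu)
        for term, p_coll in collection_model.items()
    }


def kl_divergence(query_model, doc_model):
    """KL(query || document); a lower value means a closer match."""
    return sum(
        p_q * math.log(p_q / doc_model[term])
        for term, p_q in query_model.items()
    )


def rank(query_tokens, docs):
    """Return document names ordered from lowest to highest divergence."""
    collection_model = unigram_model(
        [term for tokens in docs.values() for term in tokens]
    )
    # Query terms unseen in the collection have no estimate, so skip them.
    known = [term for term in query_tokens if term in collection_model]
    if not known:
        return list(docs)
    query_model = unigram_model(known)
    scores = {
        name: kl_divergence(query_model, smoothed_doc_model(tokens, collection_model))
        for name, tokens in docs.items()
    }
    return sorted(scores, key=scores.get)


docs = {
    "d1": "dependency language model for retrieval".split(),
    "d2": "speech recognition with gaussian mixture".split(),
    "d3": "language model smoothing".split(),
}
ranking = rank("dependency language model".split(), docs)
print(ranking)
```

In the model the abstract describes, the unigram distributions would be replaced by dependency models for the query and the document; the ranking-by-divergence step stays the same in spirit.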