Applying summarization techniques for term selection in relevance feedback
- 1 September 2001
- proceedings article
- Published by Association for Computing Machinery (ACM)
Abstract
Query-expansion is an effective Relevance Feedback technique for improving performance in Information Retrieval. In general, query-expansion methods select terms from the complete contents of relevant documents. One problem with this approach is that expansion terms unrelated to document relevance can be introduced into the modified query due to their presence in the relevant documents and distribution in the document collection. Motivated by the hypothesis that query-expansion terms should only be sought from the most relevant areas of a document, this investigation explores the use of document summaries in query-expansion, considering both context-independent standard summaries and query-biased summaries. Experimental results using the Okapi BM25 probabilistic retrieval model with the TREC-8 ad hoc retrieval task show that query-expansion using document summaries can be considerably more effective than using full-document expansion. The paper also presents a novel approach to term-selection that separates the choice of relevant documents from the selection of a pool of potential expansion terms. Again, this technique is shown to be more effective than standard methods.
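The idea of drawing expansion terms from summaries rather than full documents can be illustrated with a minimal sketch. The abstract does not specify the summarizer or the term-ranking formula, so everything here is an assumption made for illustration: the function names `query_biased_summary` and `expansion_terms` are hypothetical, sentences are scored simply by how many distinct query terms they contain, and candidate terms are ranked by r times a Robertson/Sparck Jones-style relevance weight. No stemming or stopword removal is applied.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase a string and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def query_biased_summary(document, query, max_sentences=2):
    """Return the sentences that share the most distinct terms with the query.

    Assumption: plain term overlap stands in for the paper's sentence scoring.
    """
    query_terms = set(tokenize(query))
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", document) if s.strip()]
    # sorted() is stable, so ties keep their original document order.
    ranked = sorted(
        sentences,
        key=lambda s: len(query_terms & set(tokenize(s))),
        reverse=True,
    )
    return ranked[:max_sentences]


def expansion_terms(summaries, query, doc_freq, num_docs, top_k=3):
    """Rank non-query terms found in the summaries of relevant documents.

    summaries: one list of sentences per relevant document.
    doc_freq:  term -> number of collection documents containing it (assumed given).
    Score:     r * log-odds relevance weight, where r is the number of
               summaries containing the term (an assumed ranking function).
    """
    num_relevant = len(summaries)
    query_terms = set(tokenize(query))
    r_counts = Counter()
    for summary in summaries:
        r_counts.update(set(tokenize(" ".join(summary))) - query_terms)

    scored = []
    for term, r in r_counts.items():
        n = max(doc_freq.get(term, r), r)  # a term occurs in at least r documents
        weight = math.log(
            ((r + 0.5) * (num_docs - n - num_relevant + r + 0.5))
            / ((n - r + 0.5) * (num_relevant - r + 0.5))
        )
        scored.append((r * weight, term))
    # Highest score first; break ties alphabetically for determinism.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [term for _, term in scored[:top_k]]
```

In this sketch, full-document expansion corresponds to passing every sentence of each relevant document to `expansion_terms`, whereas summary-based expansion passes only the output of `query_biased_summary`, which narrows the pool of candidate terms to the passages closest to the query.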
This publication has 9 references indexed in Scilit:
- Advantages of query biased summaries in information retrieval. Published by Association for Computing Machinery (ACM), 1998
- Improving automatic query expansion. Published by Association for Computing Machinery (ACM), 1998
- Summarization-based query expansion in information retrieval. Published by Association for Computational Linguistics (ACL), 1998
- Query expansion using local and global document analysis. Published by Association for Computing Machinery (ACM), 1996
- Relevance feedback with too much data. Published by Association for Computing Machinery (ACM), 1995
- On term selection for query expansion. Journal of Documentation, 1990
- Constructing literature abstracts by computer: Techniques and prospects. Information Processing & Management, 1990
- An algorithm for suffix stripping. Program: electronic library and information systems, 1980
- New Methods in Automatic Extracting. Journal of the ACM, 1969