Efficient passage ranking for document databases
- 1 October 1999
- journal article
- Published by Association for Computing Machinery (ACM) in ACM Transactions on Information Systems
- Vol. 17 (4), 406-439
- https://doi.org/10.1145/326440.326445
Abstract
Queries to text collections are resolved by ranking the documents in the collection and returning the highest-scoring documents to the user. An alternative retrieval method is to rank passages, that is, short fragments of documents, a strategy that can improve effectiveness and identify relevant material in documents that are too large for users to consider as a whole. However, ranking of passages can considerably increase retrieval costs. In this article we explore alternative query evaluation techniques, and develop new techniques for evaluating queries on passages. We show experimentally that, appropriately implemented, effective passage retrieval is practical in limited memory on a desktop machine. Compared to passage ranking with adaptations of current document ranking algorithms, our new “DO-TOS” passage-ranking algorithm requires only a fraction of the resources, at the cost of a small loss of effectiveness.
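To make the distinction between document ranking and passage ranking concrete, the following is a minimal sketch of naive passage ranking. It is not the DO-TOS algorithm, whose details the abstract does not give. It assumes fixed-length, overlapping word windows as passages and a simple term-frequency times inverse-document-frequency score; the function names, window sizes, and scoring formula are illustrative choices, not taken from the article.

```python
import math
from collections import Counter


def passages(tokens, size=50, stride=25):
    """Yield (start, window) pairs of fixed-length, overlapping word windows.

    The last window may be shorter than `size`.
    """
    for start in range(0, len(tokens), stride):
        yield start, tokens[start:start + size]


def rank_by_best_passage(docs, query, size=50, stride=25):
    """Rank documents by the score of their best passage.

    Returns a list of (score, doc_id, passage_start) tuples, best first.
    The scoring formula is an assumed, simplified TF x IDF weighting.
    """
    tokenized = {doc_id: text.lower().split() for doc_id, text in docs.items()}
    n_docs = len(tokenized)

    # Document frequency: number of documents containing each term.
    df = Counter()
    for tokens in tokenized.values():
        df.update(set(tokens))

    q_terms = set(query.lower().split())
    results = []
    for doc_id, tokens in tokenized.items():
        best_score, best_start = 0.0, 0
        # Every passage is scored, which is why naive passage ranking
        # costs far more than scoring each document once.
        for start, window in passages(tokens, size, stride):
            tf = Counter(window)
            score = sum(
                tf[t] * math.log(1 + n_docs / df[t])
                for t in q_terms
                if tf[t]
            )
            if score > best_score:
                best_score, best_start = score, start
        results.append((best_score, doc_id, best_start))

    results.sort(key=lambda r: -r[0])
    return results
```

Calling `rank_by_best_passage({"d1": "...", "d2": "..."}, "text retrieval")` returns each document with the offset of its highest-scoring window. Because the number of windows grows with total collection length rather than with the number of documents, even this toy version hints at the extra cost that the article sets out to reduce.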