Efficient query evaluation using a two-level retrieval process
Top Cited Papers
- 3 November 2003
- proceedings article
- Published by Association for Computing Machinery (ACM)
- p. 426-434
- https://doi.org/10.1145/956863.956944
Abstract
We present an efficient query evaluation method based on a two level approach: at the first level, our method iterates in parallel over query term postings and identifies candidate documents using an approximate evaluation taking into account only partial information on term occurrences and no query independent factors; at the second level, promising candidates are fully evaluated and their exact scores are computed. The efficiency of the evaluation process can be improved significantly using dynamic pruning techniques with very little cost in effectiveness. The amount of pruning can be controlled by the user as a function of time allocated for query evaluation. Experimentally, using the TREC Web Track data, we have determined that our algorithm significantly reduces the total number of full evaluations by more than 90%, almost without any loss in precision or recall. At the heart of our approach there is an efficient implementation of a new Boolean construct called WAND or Weak AND that might be of independent interest.Keywords
This publication has 12 references indexed in Scilit:
- Term-ordered query evaluation versus document-ordered query evaluation for large document databasesPublished by Association for Computing Machinery (ACM) ,1998
- Compressed inverted files with reduced decoding overheadsPublished by Association for Computing Machinery (ACM) ,1998
- The anatomy of a large-scale hypertextual Web search engineComputer Networks and ISDN Systems, 1998
- Query evaluation: Strategies and optimizationsInformation Processing & Management, 1995
- A fast match for continuous speech recognition using allophonic modelsPublished by Institute of Electrical and Electronics Engineers (IEEE) ,1992
- Optimal semijoins for distributed database systemsIEEE Transactions on Software Engineering, 1990
- Full text indexing based on lexical relations an application: software librariesPublished by Association for Computing Machinery (ACM) ,1989
- Optimization of inverted vector searchesPublished by Association for Computing Machinery (ACM) ,1985
- Extended Boolean information retrievalCommunications of the ACM, 1983
- Space/time trade-offs in hash coding with allowable errorsCommunications of the ACM, 1970