Optimizing result prefetching in web search engines with segmented indices
- 1 February 2004
- journal article
- Published by Association for Computing Machinery (ACM) in ACM Transactions on Internet Technology
- Vol. 4 (1) , 31-59
- https://doi.org/10.1145/967030.967032
Abstract
We study the process in which search engines with segmented indices serve queries. In particular, we investigate the number of result pages that search engines should prepare during the query processing phase.Search engine users have been observed to browse through very few pages of results for queries that they submit. This behavior of users suggests that prefetching many results upon processing an initial query is not efficient, since most of the prefetched results will not be requested by the user who initiated the search. However, a policy that abandons result prefetching in favor of retrieving just the first page of search results might not make optimal use of system resources either.We argue that for a certain behavior of users, engines should prefetch a constant number of result pages per query. We define a concrete query processing model for search engines with segmented indices, and analyze the cost of such prefetching policies. Based on these costs, we show how to determine the constant that optimizes the prefetching policy. Our results are mostly applicable to local index partitions of the inverted files, but are also applicable to processing short queries in global index architectures.Keywords
This publication has 12 references indexed in Scilit:
- Computer programming as an artPublished by Association for Computing Machinery (ACM) ,2007
- Web search for a planet: the google cluster architectureIEEE Micro, 2003
- Searching the WebACM Transactions on Internet Technology, 2001
- Real life, real users, and real needs: a study and analysis of user queries on the webInformation Processing & Management, 2000
- Evaluating the performance of distributed architectures for information retrieval using a variety of workloadsACM Transactions on Information Systems, 2000
- Balanced AllocationsSIAM Journal on Computing, 1999
- Searching the World Wide WebScience, 1998
- The anatomy of a large-scale hypertextual Web search engineComputer Networks and ISDN Systems, 1998
- Numerical recipes: the art of scientific computingAnalytica Chimica Acta, 1987
- Some applications of two approximations to the multinomial distributionBiometrika, 1960