Efficient algorithms for predicting requests to Web servers

Abstract
Internet traffic has grown significantly with the popularity of the Web. Consequently user perceived latency in retrieving Web pages has increased. Caching and prefetching at the client side, aided by hints from the server, are attempts at solving this problem. We suggest techniques to group resources that are likely to be accessed together into volumes, which are used to generate hints tailored to individual applications, such as prefetching, cache replacement, and cache validation. We discuss theoretical aspects of optimal volume construction, and develop efficient heuristics. Tunable parameters allow our algorithms to predict as many accesses as possible while reducing false predictions and limiting the size of hints. We analyze a collection of large server logs, extracting access patterns to construct and evaluate volumes. We examine sampling techniques to process only portions of the server logs while constructing equally good volumes. We show that it is possible to predict requests at low cost with a high degree of precision.

This publication has 8 references indexed in Scilit: