Multi-Engine Search and Comparison Using the MetaCrawler

Abstract
Standard Web search services, though useful, are far from ideal. There are over a dozen different search services currently in existence, each with a unique interface and a database covering a different portion of the Web. As a result, users are forced to try and retry their queries across different services. Furthermore, the services return many responses that are irrelevant, outdated, or unavailable, forcing the user to sift manually through the responses for useful information. This paper presents the MetaCrawler, a fielded Web service that represents the next level up in the information "food chain." The MetaCrawler provides a single, central interface for Web document searching. Upon receiving a query, the MetaCrawler posts the query to multiple search services in parallel, collates the returned references, and loads those references to verify their existence and to ensure that they contain relevant information. The MetaCrawler is sufficiently lightweight to reside on a user's machine, which facilitates customization, privacy, sophisticated filtering of references, and more. The MetaCrawler also serves as a tool for comparison of diverse search services. Using the MetaCrawler's data, we present a "Consumer Reports" evaluation of six Web search services: Galaxy [5], InfoSeek [1], Lycos [15], Open Text [20], WebCrawler [22], and Yahoo [9]. We also report on the queries most commonly submitted to the MetaCrawler.
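
The query pipeline described above (parallel posting, collation, verification) can be illustrated with a minimal Python sketch. This is not the MetaCrawler's implementation: the engines `engine_a` and `engine_b`, the stub `fake_fetch`, the example URLs, and the ranking rule (references returned by more engines come first) are all hypothetical stand-ins chosen for illustration, and no network access is performed.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List, Tuple

# Hypothetical engine type: maps a query to a ranked list of URLs.
# A real client would issue an HTTP request to each search service.
Engine = Callable[[str], List[str]]


def engine_a(query: str) -> List[str]:
    # Assumed canned response, for illustration only.
    return ["http://example.org/a", "http://example.org/shared"]


def engine_b(query: str) -> List[str]:
    # Assumed canned response, for illustration only.
    return ["http://example.org/shared", "http://example.org/dead"]


def fake_fetch(url: str) -> str:
    """Stub page loader: returns page text, or '' if the page is unavailable."""
    pages = {
        "http://example.org/a": "notes on web search engines",
        "http://example.org/shared": "metasearch across web search services",
    }
    return pages.get(url, "")


def meta_search(
    query: str,
    engines: Dict[str, Engine],
    fetch: Callable[[str], str],
) -> List[Tuple[str, List[str]]]:
    # 1. Post the query to every engine in parallel.
    with ThreadPoolExecutor(max_workers=max(1, len(engines))) as pool:
        futures = {name: pool.submit(fn, query) for name, fn in engines.items()}
        results = {name: fut.result() for name, fut in futures.items()}

    # 2. Collate: merge duplicate URLs, recording which engines returned each.
    collated: Dict[str, List[str]] = {}
    for name, urls in results.items():
        for url in urls:
            collated.setdefault(url, []).append(name)

    # 3. Verify: load each reference; drop unavailable pages and pages
    #    that do not contain every query term.
    terms = query.lower().split()
    verified = []
    for url, sources in collated.items():
        text = fetch(url).lower()
        if text and all(term in text for term in terms):
            verified.append((url, sources))

    # Assumed ranking: references found by more engines first, then by URL.
    verified.sort(key=lambda item: (-len(item[1]), item[0]))
    return verified
```

Calling `meta_search("web search", {"a": engine_a, "b": engine_b}, fake_fetch)` in this sketch merges the duplicate reference reported by both engines and drops the one whose page cannot be loaded. Passing the fetch function as a parameter keeps the verification step replaceable, for example by a real HTTP loader with a timeout.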