Building efficient and effective metasearch engines
- 1 March 2002
- journal article
- Published by Association for Computing Machinery (ACM) in ACM Computing Surveys
- Vol. 34 (1) , 48-89
- https://doi.org/10.1145/505282.505284
Abstract
Frequently a user's information needs are stored in the databases of multiple search engines. It is inconvenient and inefficient for an ordinary user to invoke multiple search engines and identify useful documents from the returned results. To support unified access to multiple search engines, a metasearch engine can be constructed. When a metasearch engine receives a query from a user, it invokes the underlying search engines to retrieve useful information for the user. Metasearch engines have other benefits as a search tool such as increasing the search coverage of the Web and improving the scalability of the search. In this article, we survey techniques that have been proposed to tackle several underlying challenges for building a good metasearch engine. Among the main challenges, the database selection problem is to identify search engines that are likely to return useful documents to a given query. The document selection problem is to determine what documents to retrieve from each identified search engine. The result merging problem is to combine the documents returned from multiple search engines. We will also point out some problems that need to be further researched.
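The three challenges named in the abstract (database selection, document selection, result merging) can be illustrated with a toy pipeline. The sketch below is not taken from the article: the `Engine` class, the `usefulness` estimate, the fixed per-engine cutoff, and the score normalization are all hypothetical simplifications chosen for illustration. In particular, a real metasearch engine would normally not have direct access to each engine's documents and would have to estimate usefulness from summary information instead.

```python
from dataclasses import dataclass


@dataclass
class Engine:
    """Hypothetical stand-in for a component search engine."""
    name: str
    docs: dict  # doc_id -> document text

    def search(self, query_terms, limit):
        """Return up to `limit` (doc_id, local_score) pairs, best first."""
        scored = []
        for doc_id, text in self.docs.items():
            words = text.lower().split()
            # Local score: total occurrences of the query terms.
            score = sum(words.count(term) for term in query_terms)
            if score > 0:
                scored.append((doc_id, score))
        scored.sort(key=lambda pair: (-pair[1], pair[0]))
        return scored[:limit]


def usefulness(engine, query_terms):
    """Toy database-selection estimate (an assumption, not the article's
    method): the number of documents containing at least one query term."""
    return sum(
        1
        for text in engine.docs.values()
        if any(term in text.lower().split() for term in query_terms)
    )


def metasearch(engines, query, max_engines=2, per_engine=2):
    """Run the three stages and return (score, engine_name, doc_id) tuples."""
    terms = query.lower().split()

    # 1. Database selection: keep the engines estimated to be most useful.
    ranked = sorted(engines, key=lambda e: (-usefulness(e, terms), e.name))
    selected = [e for e in ranked[:max_engines] if usefulness(e, terms) > 0]

    merged = []
    for engine in selected:
        # 2. Document selection: a fixed cutoff per selected engine.
        hits = engine.search(terms, per_engine)
        if not hits:
            continue
        # 3. Result merging: divide each local score by that engine's
        #    top score so scores from different engines are comparable.
        top_score = hits[0][1]
        for doc_id, score in hits:
            merged.append((score / top_score, engine.name, doc_id))

    merged.sort(key=lambda item: (-item[0], item[1], item[2]))
    return merged
```

Normalizing by each engine's top score is only one simple way to make local scores comparable; it is used here because it needs no information beyond the returned results.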
This publication has 31 references indexed in Scilit:
- A statistical method for estimating the usefulness of text databases. IEEE Transactions on Knowledge and Data Engineering, 2002
- Methods for information server selection. ACM Transactions on Information Systems, 1999
- Infoseek's experiences searching the internet. ACM SIGIR Forum, 1998
- Inquirus, the NECI meta search engine. Computer Networks and ISDN Systems, 1998
- The anatomy of a large-scale hypertextual Web search engine. Computer Networks and ISDN Systems, 1998
- The MetaCrawler architecture for resource aggregation on the Web. IEEE Expert, 1997
- Boolean similarity measures for resource discovery. IEEE Transactions on Knowledge and Data Engineering, 1997
- Wide Area Technical Report Service. Communications of the ACM, 1995
- ALIWEB - Archie-like indexing in the WEB. Computer Networks and ISDN Systems, 1994
- SIBRIS: the Sandwich Interactive Browsing and Ranking Information System. Journal of Information Science, 1989