Efficient and effective metasearch for text databases incorporating linkages among documents
- 1 May 2001
- proceedings article
- Published by Association for Computing Machinery (ACM)
- Vol. 30 (2) , 187-198
- https://doi.org/10.1145/375663.375684
Abstract
Linkages among documents have a significant impact on the importance of documents, as it can be argued that important documents are pointed to by many documents or by other important documents. Metasearch engines can be used to help ordinary users retrieve information from multiple local sources (text databases). There is a search engine associated with each database. In a large-scale metasearch engine, the contents of each local database are represented by a representative. Each user query is evaluated against the set of representatives of all databases in order to determine the appropriate databases (search engines) to search (invoke). In previous work, the linkage information between documents has not been utilized in determining the appropriate databases to search. In this paper, such information is employed to determine the degree of relevance of a document with respect to a given query. Specifically, the importance (rank) of each document as determined by the linkages is integrated in each database representative to facilitate the selection of databases for each given query. We establish a necessary and sufficient condition to rank databases optimally, while incorporating the linkage information. A method is provided to estimate the desired quantities stated in the necessary and sufficient condition. The estimation method runs in time linearly proportional to the number of query terms. Experimental results are provided to demonstrate the high retrieval effectiveness of the method.
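The database-selection idea in the abstract can be illustrated with a minimal sketch. It is not the paper's estimator: the representative layout (`max_term_weight`, `max_doc_rank`), the mixing weight `w`, and the scoring formula are all hypothetical, chosen only to show a query being scored against per-database representatives in a single pass over the query terms, with a linkage-based rank folded into the score.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Representative:
    """Hypothetical per-database summary; field names are illustrative only."""
    name: str
    # Assumed: largest normalized weight of each term in any document.
    max_term_weight: Dict[str, float] = field(default_factory=dict)
    # Assumed: largest linkage-based rank among documents containing the term.
    max_doc_rank: Dict[str, float] = field(default_factory=dict)


def estimate_score(rep: Representative, query: Dict[str, float], w: float = 0.5) -> float:
    """Illustrative estimate mixing term similarity with linkage-based rank."""
    similarity = 0.0
    rank = 0.0
    # One pass over the query, so the cost is linear in the number of query terms.
    for term, query_weight in query.items():
        similarity += query_weight * rep.max_term_weight.get(term, 0.0)
        rank = max(rank, rep.max_doc_rank.get(term, 0.0))
    return w * similarity + (1.0 - w) * rank


def rank_databases(reps: List[Representative], query: Dict[str, float], w: float = 0.5) -> List[str]:
    """Order database names by descending estimated score (ties broken by name)."""
    scored = [(estimate_score(rep, query, w), rep.name) for rep in reps]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [name for _, name in scored]


# Made-up representatives and query weights, for illustration only.
reps = [
    Representative("news", {"linkage": 0.2}, {"linkage": 0.1}),
    Representative(
        "cs",
        {"linkage": 0.8, "metasearch": 0.9},
        {"linkage": 0.7, "metasearch": 0.4},
    ),
]
query = {"linkage": 0.6, "metasearch": 0.8}
ranking = rank_databases(reps, query)
```

With these made-up numbers, the `cs` database is ranked ahead of `news`, so a metasearch engine following this sketch would invoke the `cs` search engine first.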