A decision-theoretic approach to database selection in networked IR
- 1 July 1999
- journal article
- Published by Association for Computing Machinery (ACM) in ACM Transactions on Information Systems
- Vol. 17 (3), 229-249
- https://doi.org/10.1145/314516.314517
Abstract
In networked IR, a client submits a query to a broker, which is in contact with a large number of databases. In order to yield a maximum number of documents at minimum cost, the broker has to make estimates about the retrieval cost of each database, and then decide for each database whether or not to use it for the current query, and, if so, how many documents to retrieve from it. For this purpose, we develop a general decision-theoretic model and discuss different cost structures. Besides the cost of retrieving relevant versus nonrelevant documents, we consider the following parameters for each database: expected retrieval quality, expected number of relevant documents in the database, and cost factors for query processing and document delivery. For computing the overall optimum, a divide-and-conquer algorithm is given. If there are several brokers knowing different databases, a preselection of brokers can only be performed heuristically, but the computation of the optimum can be done similarly to the single-broker case. In addition, we derive a formula which estimates the number of relevant documents in a database based on dictionary information.
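The abstract does not state the cost functions or the algorithm itself, so the following is only a minimal sketch of how a divide-and-conquer search for a cost-optimal allocation might look. It assumes a hypothetical cost model in which each database charges a fixed query-processing cost plus a per-document delivery cost, has a constant expected precision, and holds a known number of relevant documents. The names `make_cost` and `optimal_allocation` are illustrative and not taken from the article.

```python
import math
from typing import Callable, List, Tuple

# Hypothetical type: expected cost of obtaining r relevant documents from one database.
CostFn = Callable[[int], float]


def make_cost(setup: float, per_doc: float, precision: float, max_relevant: int) -> CostFn:
    """Assumed cost model: fixed query cost plus delivery cost per retrieved document.

    With constant precision (a simplification), r relevant documents require
    r / precision retrieved documents in expectation.
    """
    def cost(r: int) -> float:
        if r == 0:
            return 0.0          # database is not queried at all
        if r > max_relevant:
            return math.inf     # database cannot supply that many relevant documents
        return setup + per_doc * r / precision
    return cost


def optimal_allocation(cost_fns: List[CostFn], n: int) -> List[Tuple[float, List[int]]]:
    """Return table[k] = (minimum total cost, per-database allocation) for k = 0..n.

    Divide and conquer: solve each half of the database list, then merge the
    two tables by trying every split of k between the halves.
    """
    if not cost_fns:
        raise ValueError("at least one database is required")
    if len(cost_fns) == 1:
        return [(cost_fns[0](k), [k]) for k in range(n + 1)]

    mid = len(cost_fns) // 2
    left = optimal_allocation(cost_fns[:mid], n)
    right = optimal_allocation(cost_fns[mid:], n)

    table = []
    for k in range(n + 1):
        # j relevant documents come from the left half, k - j from the right half.
        best_cost, j = min(
            ((left[j][0] + right[k - j][0], j) for j in range(k + 1)),
            key=lambda pair: pair[0],
        )
        table.append((best_cost, left[j][1] + right[k - j][1]))
    return table


# Usage with three made-up databases (all parameter values are illustrative).
databases = [
    make_cost(setup=5, per_doc=1, precision=0.5, max_relevant=10),
    make_cost(setup=2, per_doc=1, precision=0.25, max_relevant=10),
    make_cost(setup=0, per_doc=1, precision=1.0, max_relevant=2),
]
cost, allocation = optimal_allocation(databases, 4)[4]
print(cost, allocation)
```

Merging complete cost tables rather than single choices lets the fixed query-processing cost of a database be weighed against its retrieval quality: a database is skipped (allocation 0) whenever its setup cost is not repaid by the documents it would contribute.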