Query-based sampling of text databases
- 1 April 2001
- journal article
- Published by Association for Computing Machinery (ACM) in ACM Transactions on Information Systems
- Vol. 19 (2), 97-130
- https://doi.org/10.1145/382979.383040
Abstract
The proliferation of searchable text databases on corporate networks and the Internet causes a database selection problem for many people. Algorithms such as gGLOSS and CORI can automatically select which text databases to search for a given information need, but only if given a set of resource descriptions that accurately represent the contents of each database. The existing techniques for acquiring resource descriptions have significant limitations when used in wide-area networks controlled by many parties. This paper presents query-based sampling, a new technique for acquiring accurate resource descriptions. Query-based sampling does not require the cooperation of resource providers, nor does it require that resource providers use a particular search engine or representation technique. An extensive set of experimental results demonstrates that accurate resource descriptions are created, that computation and communication costs are reasonable, and that the resource descriptions do in fact enable accurate automatic database selection.