Local versus global link information in the Web
- 1 January 2003
- journal article
- Published by Association for Computing Machinery (ACM) in ACM Transactions on Information Systems
- Vol. 21 (1), 42-63
- https://doi.org/10.1145/635484.635486
Abstract
Information derived from the cross-references among the documents in a hyperlinked environment, usually referred to as link information, is considered important since it can be used to effectively improve document retrieval. Depending on the retrieval strategy, link information can be local or global. Local link information is derived from the set of documents returned as answers to the current user query. Global link information is derived from all the documents in the collection. In this work, we investigate how the use of local link information compares to the use of global link information. For the comparison, we run a series of experiments using a large document collection extracted from the Web. For our reference collection, the results indicate that the use of local link information improves precision by 74%. When global link information is used, precision improves by 35%. However, when only the first 10 documents in the ranking are considered, the average gain in precision obtained with the use of global link information is higher than the gain obtained with the use of local link information. This is an interesting result since it provides insight and justification for the use of global link information in major Web search engines, where users are mostly interested in the first 10 answers. Furthermore, global information can be computed in the background, which allows speeding up query processing.
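The distinction between local and global link information can be illustrated with a minimal sketch. It assumes a plain in-link count as the link-based score, which is only a stand-in and not the measure evaluated in the article; the toy `links` list, the `answers` set, and the helper `inlink_counts` are hypothetical names introduced for illustration.

```python
from collections import Counter


def inlink_counts(links, allowed=None):
    """Count in-links per document.

    If `allowed` is given, only links whose source and target are both
    in `allowed` are counted (the local case). Otherwise every link in
    the collection is counted (the global case).
    """
    counts = Counter()
    for src, dst in links:
        if allowed is None or (src in allowed and dst in allowed):
            counts[dst] += 1
    return counts


# Hypothetical toy collection: (source, target) hyperlinks.
links = [("a", "b"), ("c", "b"), ("d", "b"), ("a", "c"), ("e", "c"), ("d", "e")]

# Hypothetical answer set returned for one user query.
answers = {"a", "b", "c"}

# Global: derived from all documents; can be computed once, in the background.
global_scores = inlink_counts(links)

# Local: derived only from the answer set; has to be computed per query.
local_scores = inlink_counts(links, allowed=answers)
```

In this sketch the global scores do not depend on the query, whereas the local scores ignore every link that touches a document outside the answer set, so they must be recomputed for each new query.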
This publication has 15 references indexed in Scilit:
- SALSA. ACM Transactions on Information Systems, 2001
- Improving the effectiveness of information retrieval with local context analysis. ACM Transactions on Information Systems, 2000
- The anatomy of a large-scale hypertextual Web search engine. Computer Networks and ISDN Systems, 1998
- Automatic resource compilation by analyzing hyperlink structure and associated text. Computer Networks and ISDN Systems, 1998
- On modeling information retrieval with probabilistic inference. ACM Transactions on Information Systems, 1995
- Evaluation of an inference network-based retrieval model. ACM Transactions on Information Systems, 1991
- Co-citation in the scientific literature: A new measure of the relationship between two documents. Journal of the American Society for Information Science, 1973
- Citation Analysis as a Tool in Journal Evaluation. Science, 1972
- Automatic indexing using bibliographic citations. Journal of Documentation, 1971
- Bibliographic coupling between scientific papers. American Documentation, 1963