Searching the Web
Open Access
- 1 August 2001
- journal article
- Published by Association for Computing Machinery (ACM) in ACM Transactions on Internet Technology
- Vol. 1 (1), 2-43
- https://doi.org/10.1145/383034.383035
Abstract
We offer an overview of current Web search engine design. After introducing a generic search engine architecture, we examine each engine component in turn. We cover crawling, local Web page storage, indexing, and the use of link analysis for boosting search performance. The most common design and implementation techniques for each of these components are presented. For this presentation we draw from the literature and from our own experimental search engine testbed. Emphasis is on introducing the fundamental concepts and the results of several performance analyses we conducted to compare different designs.
This publication has 37 references indexed in Scilit:
- Authoritative sources in a hyperlinked environment. Journal of the ACM, 1999
- Enhanced hypertext categorization using hyperlinks. ACM SIGMOD Record, 1998
- The Connectivity Server: fast access to linkage information on the Web. Computer Networks and ISDN Systems, 1998
- Searching the World Wide Web. Science, 1998
- The anatomy of a large-scale hypertextual Web search engine. Computer Networks and ISDN Systems, 1998
- Efficient crawling through URL ordering. Computer Networks and ISDN Systems, 1998
- Automatic resource compilation by analyzing hyperlink structure and associated text. Computer Networks and ISDN Systems, 1998
- Access methods for text. ACM Computing Surveys, 1985
- Signature files. ACM Transactions on Information Systems, 1984
- Citation Analysis as a Tool in Journal Evaluation. Science, 1972