A document retrieval system based on nearest neighbour searching
- 1 February 1988
- journal article
- Published by SAGE Publications in Journal of Information Science
- Vol. 14 (1), 25-33
- https://doi.org/10.1177/016555158801400104
Abstract
Document filing and retrieval systems can be designed using advanced techniques resulting from recent research in information retrieval. In this paper, a document retrieval system is presented, based upon the vector processing model. The system employs an automatic indexing procedure with a weighting scheme to reflect term importance. Documents are stored using an inverted file organization. Natural language queries are supported with a retrieval strategy based on best match techniques and relevance feedback. The emphasis is on nearest neighbour searching to locate documents closest to a given query. That is, once a similarity function has been defined, the task is to identify those documents in the collection which exhibit a higher degree of resemblance to the query. The problem is introduced with reference to a straightforward search procedure that returns the nearest neighbour set by manipulating the inverted file entries. Then, an improved algorithm is presented which optimizes both the number of documents to be evaluated and the number of inverted lists to be inspected.
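The straightforward search procedure mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the tf-idf style weighting, the length-normalized dot product used as the similarity function, and the names `build_index` and `nearest_neighbours` are all assumptions chosen for illustration. The abstract does not specify the weighting scheme or similarity function, and the improved algorithm it describes is not reproduced here.

```python
import heapq
import math
from collections import Counter, defaultdict


def build_index(docs):
    """Build an inverted file from a list of token lists.

    Returns (index, norms): index maps term -> [(doc_id, weight), ...],
    norms[doc_id] is the Euclidean length of that document's weight vector.
    The weighting (tf * log(1 + N/df)) is an assumed scheme.
    """
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    index = defaultdict(list)
    norms = [0.0] * n
    for doc_id, doc in enumerate(docs):
        for term, tf in Counter(doc).items():
            weight = tf * math.log(1 + n / df[term])
            index[term].append((doc_id, weight))
            norms[doc_id] += weight * weight
    return index, [math.sqrt(x) for x in norms]


def nearest_neighbours(query, index, norms, k=3):
    """Return the k best-matching (doc_id, score) pairs for a tokenized query.

    Only the inverted lists of the query terms are inspected, so documents
    sharing no term with the query are never evaluated.
    """
    scores = defaultdict(float)
    for term, qtf in Counter(query).items():
        for doc_id, weight in index.get(term, ()):
            scores[doc_id] += qtf * weight
    # Normalize by document length; the query length is constant across
    # documents and does not affect the ranking, so it is omitted.
    ranked = ((s / norms[d], d) for d, s in scores.items() if norms[d] > 0)
    return [(d, s) for s, d in heapq.nlargest(k, ranked)]
```

In this sketch every document appearing in any inspected inverted list is scored. The improved algorithm the abstract refers to aims to reduce both that set of documents and the number of lists inspected.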