Document clustering using an inverted file approach
- 1 October 1980
- journal article
- Published by SAGE Publications in Journal of Information Science
- Vol. 2 (5), 223-231
- https://doi.org/10.1177/016555158000200503
Abstract
An automatic document clustering procedure is described which does not require the use of an inter-document similarity matrix and which is independent of the order in which the documents are processed. The procedure makes use of an initial set of clusters which is derived from certain of the terms in the indexing vocabulary used to characterise the documents in the file. The retrieval effectiveness obtained using the clustered file is compared with that obtained from serial searching and from use of the single-linkage clustering method.
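The general idea in the abstract can be illustrated with a minimal sketch. This is not the procedure from the article, whose details the abstract does not give. It assumes that seed clusters are the posting lists of index terms whose document frequency falls within a chosen range, that each document is assigned to the seed with the most similar term profile (Dice coefficient), and that ties are broken alphabetically by seed term. All function names, parameters, and sample data are hypothetical.

```python
from collections import defaultdict


def build_inverted_file(docs):
    """Map each index term to the set of document ids that carry it."""
    inverted = defaultdict(set)
    for doc_id, terms in docs.items():
        for term in terms:
            inverted[term].add(doc_id)
    return dict(inverted)


def seed_clusters(inverted, min_postings, max_postings):
    """Assumed seeding rule: keep posting lists whose length is in range."""
    return {
        term: set(ids)
        for term, ids in inverted.items()
        if min_postings <= len(ids) <= max_postings
    }


def dice(a, b):
    """Dice coefficient between two term sets."""
    if not a and not b:
        return 0.0
    return 2 * len(a & b) / (len(a) + len(b))


def assign_documents(docs, seeds):
    """Assign every document to the seed cluster with the closest profile.

    Profiles are computed once from the seeds and never updated, so the
    result does not depend on the order in which documents are visited,
    and no document-to-document similarity matrix is built.
    """
    if not seeds:
        return {}
    profiles = {
        term: set().union(*(docs[d] for d in ids))
        for term, ids in seeds.items()
    }
    clusters = {term: set() for term in seeds}
    ordered_terms = sorted(profiles)  # fixed order gives deterministic ties
    for doc_id, terms in docs.items():
        best = max(ordered_terms, key=lambda t: dice(terms, profiles[t]))
        clusters[best].add(doc_id)
    return clusters


if __name__ == "__main__":
    # Hypothetical toy collection: document id -> set of index terms.
    sample = {
        "d1": {"cluster", "retrieval"},
        "d2": {"cluster", "matrix"},
        "d3": {"index", "retrieval"},
        "d4": {"index", "vocabulary"},
    }
    seeds = seed_clusters(build_inverted_file(sample), 2, 2)
    print(assign_documents(sample, seeds))
```

Each document is compared only with the seed profiles, so the number of similarity computations grows with documents times seeds rather than with the square of the number of documents. That is the kind of saving over a full similarity matrix that the abstract points to.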