Data clustering
Open Access
- 1 September 1999
- journal article
- research article
- Published by Association for Computing Machinery (ACM) in ACM Computing Surveys
- Vol. 31 (3) , 264-323
- https://doi.org/10.1145/331499.331504
Abstract
Clustering is the unsupervised classification of patterns (observations, data items, or feature vectors) into groups (clusters). The clustering problem has been addressed in many contexts and by researchers in many disciplines; this reflects its broad appeal and usefulness as one of the steps in exploratory data analysis. However, clustering is a difficult problem combinatorially, and differences in assumptions and contexts in different communities have made the transfer of useful generic concepts and methodologies slow to occur. This paper presents an overview of pattern clustering methods from a statistical pattern recognition perspective, with a goal of providing useful advice and references to fundamental concepts accessible to the broad community of clustering practitioners. We present a taxonomy of clustering techniques, and identify cross-cutting themes and recent advances. We also describe some important applications of clustering algorithms such as image segmentation, object recognition, and information retrieval.