Survey of Clustering Algorithms
- 9 May 2005
- journal article
- research article
- Published by Institute of Electrical and Electronics Engineers (IEEE) in IEEE Transactions on Neural Networks
- Vol. 16 (3), 645-678
- https://doi.org/10.1109/tnn.2005.845141
Abstract
Data analysis plays an indispensable role for understanding various phenomena. Cluster analysis, primitive exploration with little or no prior knowledge, consists of research developed across a wide variety of communities. The diversity, on one hand, equips us with many tools. On the other hand, the profusion of options causes confusion. We survey clustering algorithms for data sets appearing in statistics, computer science, and machine learning, and illustrate their applications in some benchmark data sets, the traveling salesman problem, and bioinformatics, a new field attracting intensive efforts. Several tightly related topics, proximity measure, and cluster validation, are also discussed.