The Effect of Cluster Size, Dimensionality, and the Number of Clusters on Recovery of True Cluster Structure
- 1 January 1983
- journal article
- Published by Institute of Electrical and Electronics Engineers (IEEE)
- Vol. PAMI-5 (1), 40-47
- https://doi.org/10.1109/tpami.1983.4767342
Abstract
An evaluation of four clustering methods and four external criterion measures was conducted with respect to the effect of the number of clusters, dimensionality, and relative cluster sizes on the recovery of true cluster structure. The four methods were the single link, complete link, group average (UPGMA), and Ward's minimum variance algorithms. The results indicated that the four criterion measures were generally consistent with each other, and two highly similar pairs were identified among them. The first pair consisted of the Rand and corrected Rand statistics, and the second pair was the Jaccard and the Fowlkes and Mallows indexes. With respect to the methods, recovery was found to improve as the number of clusters increased and as the number of dimensions increased. The relative cluster size factor produced differential performance effects, with Ward's procedure providing the best recovery when the clusters were of equal size. The group average method gave equivalent or better recovery when the clusters were of unequal size.
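The four external criterion measures named in the abstract can all be computed from pair counts between a known partition and a recovered one. The following is a minimal Python sketch of that idea, not the paper's own procedure. The function names `pair_counts` and `external_indices` are hypothetical. The abstract does not say which correction the "corrected Rand" statistic uses, so the sketch substitutes the later Hubert and Arabie chance-adjusted form as a stand-in, which is an assumption. It also assumes that neither partition is trivial (all singletons or one single cluster), since the ratios are undefined in those cases.

```python
from itertools import combinations
from math import sqrt


def pair_counts(truth, pred):
    """Count object pairs by agreement between two partitions.

    a: same cluster in both, b: same in truth only,
    c: same in pred only,  d: different in both.
    """
    a = b = c = d = 0
    for i, j in combinations(range(len(truth)), 2):
        same_truth = truth[i] == truth[j]
        same_pred = pred[i] == pred[j]
        if same_truth and same_pred:
            a += 1
        elif same_truth:
            b += 1
        elif same_pred:
            c += 1
        else:
            d += 1
    return a, b, c, d


def external_indices(truth, pred):
    """Return the four pair-counting indices as a dict.

    Assumption: 'corrected_rand' uses the Hubert-Arabie adjustment,
    which may differ from the correction used in the paper.
    """
    a, b, c, d = pair_counts(truth, pred)
    n_pairs = a + b + c + d
    rand = (a + d) / n_pairs
    jaccard = a / (a + b + c)
    fowlkes_mallows = a / sqrt((a + b) * (a + c))
    # Expected number of jointly co-clustered pairs under chance.
    expected = (a + b) * (a + c) / n_pairs
    max_index = ((a + b) + (a + c)) / 2
    corrected_rand = (a - expected) / (max_index - expected)
    return {
        "rand": rand,
        "corrected_rand": corrected_rand,
        "jaccard": jaccard,
        "fowlkes_mallows": fowlkes_mallows,
    }


if __name__ == "__main__":
    true_labels = [0, 0, 0, 1, 1, 1]
    found_labels = [0, 0, 1, 1, 1, 1]  # one object misassigned
    for name, value in external_indices(true_labels, found_labels).items():
        print(f"{name}: {value:.4f}")
```

In this toy case the Jaccard and Fowlkes and Mallows indexes ignore the pairs separated in both partitions (count `d`), whereas the two Rand statistics use them, which is one plausible reason the measures group into the two pairs reported above.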