An Examination of Procedures for Determining the Number of Clusters in a Data Set
- 1 June 1985
- journal article
- Published by Cambridge University Press (CUP) in Psychometrika
- Vol. 50 (2) , 159-179
- https://doi.org/10.1007/bf02294245
Abstract
A Monte Carlo evaluation of 30 procedures for determining the number of clusters was conducted on artificial data sets which contained either 2, 3, 4, or 5 distinct nonoverlapping clusters. To provide a variety of clustering solutions, the data sets were analyzed by four hierarchical clustering methods. External criterion measures indicated excellent recovery of the true cluster structure by the methods at the correct hierarchy level. Thus, the clustering present in the data was quite strong. The simulation results for the stopping rules revealed a wide range in their ability to determine the correct number of clusters in the data. Several procedures worked fairly well, whereas others performed rather poorly. Thus, the latter group of rules would appear to have little validity, particularly for data sets containing distinct clusters. Applied researchers are urged to select one or more of the better criteria. However, users are cautioned that the performance of some of the criteria may be data dependent.