An Algorithm for Generating Artificial Test Clusters
- 1 March 1985
- journal article
- Published by Cambridge University Press (CUP) in Psychometrika
- Vol. 50 (1) , 123-127
- https://doi.org/10.1007/bf02294153
Abstract
An algorithm for generating artificial data sets which contain distinct nonoverlapping clusters is presented. The algorithm is useful for generating test data sets for Monte Carlo validation research conducted on clustering methods or statistics. The algorithm generates data sets which contain either 1, 2, 3, 4, or 5 clusters. By default, the data are embedded in either a 4-, 6-, or 8-dimensional space. Three different patterns for assigning the points to the clusters are provided. One pattern assigns the points equally to the clusters while the remaining two schemes produce clusters of unequal sizes. Finally, a number of methods for introducing error in the data have been incorporated in the algorithm.
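The abstract gives no implementation details, so the following is only a minimal Python sketch of the general idea, not the published algorithm. Everything specific in it is an assumption made for illustration: the function names `cluster_sizes` and `generate_clusters`, the 10% and 60% shares used for the two unequal-size patterns, the truncated normal draws, and the trick of spacing cluster centers along the first dimension so that clusters cannot overlap. The error-perturbation methods mentioned in the abstract are not modeled.

```python
import random


def cluster_sizes(n_points, n_clusters, pattern="equal"):
    """Split n_points among n_clusters.

    'equal' spreads points as evenly as possible. 'one_small' and
    'one_large' (assumed shares of 10% and 60%) give the first cluster
    a fixed share and split the remainder evenly among the others.
    """
    if pattern == "equal" or n_clusters == 1:
        base, extra = divmod(n_points, n_clusters)
        return [base + (1 if i < extra else 0) for i in range(n_clusters)]
    share = {"one_small": 0.10, "one_large": 0.60}[pattern]
    first = max(1, round(n_points * share))
    return [first] + cluster_sizes(n_points - first, n_clusters - 1, "equal")


def truncated_normal(rng, half_width):
    """Draw a zero-mean normal deviate, rejecting values outside +/- half_width."""
    while True:
        x = rng.gauss(0.0, half_width / 1.5)
        if abs(x) <= half_width:
            return x


def generate_clusters(n_points=50, n_clusters=3, n_dims=4, pattern="equal",
                      half_width=1.0, gap=0.5, seed=None):
    """Return (points, labels) with clusters that cannot overlap.

    Assumption: separation is enforced on dimension 0 only. Cluster k is
    centered at k * (2 * half_width + gap) on that axis, and every point
    lies within half_width of its center, so neighboring clusters are
    always at least `gap` apart on that axis.
    """
    rng = random.Random(seed)
    points, labels = [], []
    for k, size in enumerate(cluster_sizes(n_points, n_clusters, pattern)):
        # Random center on the other dimensions, fixed spacing on dimension 0.
        center = [rng.uniform(-5.0, 5.0) for _ in range(n_dims)]
        center[0] = k * (2 * half_width + gap)
        for _ in range(size):
            points.append([c + truncated_normal(rng, half_width) for c in center])
            labels.append(k)
    return points, labels


points, labels = generate_clusters(n_points=50, n_clusters=4, n_dims=6,
                                   pattern="one_large", seed=1)
print(len(points), len(points[0]), cluster_sizes(50, 4, "one_large"))
```

With these assumed settings, `cluster_sizes(50, 4, "one_large")` yields `[30, 7, 7, 6]`. Truncating each coordinate at `half_width` is what makes the nonoverlap guarantee hold by construction rather than on average.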