Clustering N Objects into K Groups under Optimal Scaling of Variables
- 1 December 1989
- journal article
- Published by Cambridge University Press (CUP) in Psychometrika
- Vol. 54 (4) , 699-706
- https://doi.org/10.1007/bf02296404
Abstract
We propose a method to reduce many categorical variables to one variable with k categories, or stated otherwise, to classify n objects into k groups. Objects are measured on a set of nominal, ordinal or numerical variables or any mix of these, and they are represented as n points in p-dimensional Euclidean space. Starting from homogeneity analysis, also called multiple correspondence analysis, the essential feature of our approach is that these object points are restricted to lie at only one of k locations. It follows that these k locations must be equal to the centroids of all objects belonging to the same group, which corresponds to a sum of squared distances clustering criterion. The problem is not only to estimate the group allocation, but also to obtain an optimal transformation of the data matrix. An alternating least squares algorithm and an example are given.
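The link between restricting object points to k locations and a sum of squared distances criterion can be illustrated with a minimal sketch. It is not the algorithm from the article: it treats the object points as fixed and only alternates between computing group centroids and reallocating objects, leaving out the optimal transformation of the variables entirely. All function names (`centroids`, `within_ss`, `alternate`) are hypothetical and chosen for illustration.

```python
from typing import List, Sequence, Tuple

Point = Sequence[float]


def sq_dist(a: Point, b: Point) -> float:
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def centroids(points: Sequence[Point], groups: Sequence[int], k: int) -> List[List[float]]:
    """Mean of the points allocated to each of the k groups."""
    dim = len(points[0])
    sums = [[0.0] * dim for _ in range(k)]
    counts = [0] * k
    for p, g in zip(points, groups):
        counts[g] += 1
        for d in range(dim):
            sums[g][d] += p[d]
    # Assumption: an empty group is parked at the origin (illustrative choice only).
    return [[s / c if c else 0.0 for s in row] for row, c in zip(sums, counts)]


def within_ss(points: Sequence[Point], groups: Sequence[int], locs: Sequence[Point]) -> float:
    """Sum of squared distances from each object to its group's location."""
    return sum(sq_dist(p, locs[g]) for p, g in zip(points, groups))


def alternate(
    points: Sequence[Point], groups: Sequence[int], k: int, max_iter: int = 100
) -> Tuple[List[int], List[List[float]], float]:
    """Alternate between centroid updates and reallocation until nothing changes."""
    groups = list(groups)
    for _ in range(max_iter):
        locs = centroids(points, groups, k)
        # Reallocate every object to its nearest group location.
        new_groups = [min(range(k), key=lambda g: sq_dist(p, locs[g])) for p in points]
        if new_groups == groups:
            break
        groups = new_groups
    locs = centroids(points, groups, k)
    return groups, locs, within_ss(points, groups, locs)


if __name__ == "__main__":
    pts = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (6.0, 5.0)]
    groups, locs, ss = alternate(pts, [0, 1, 0, 1], k=2)
    print(groups, locs, ss)
```

For a fixed allocation, replacing a centroid by any other location can only increase `within_ss`, which is the sense in which the k locations "must be equal to the centroids".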