Detection and Characterization of Cluster Substructure I. Linear Structure: Fuzzy c-Lines

Abstract
In Part I, a generalization of the Fuzzy c-Means (or Fuzzy ISODATA) clustering algorithms is developed. Necessary conditions for minimization of a generalized total weighted squared orthogonal error objective function lead to a Picard iteration scheme which simultaneously generates (i) c fuzzy clusters in the data; (ii) a set of c prototypical straight lines in feature space which best fit the data in a well-defined sense; and (iii) a set of c prototypical centers of mass (on the c lines) which characterize the “core” of each linear fuzzy cluster. Theoretical optimization is achieved using principal components of generalized within-cluster fuzzy scatter matrices. A convergence theorem for each algorithm in the infinite family is given. The algorithms are exemplified by five numerical examples using both real and artificial data sets having essentially “linear” substructure. In Part II, the Fuzzy c-Means and Fuzzy c-Lines algorithms are shown to be special cases of a more general class of fuzzy algorithms, the...
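
The iteration summarized above can be illustrated with a minimal sketch. It is not the authors' code, and it assumes the form of the updates usually associated with Fuzzy c-Lines: fuzzy-weighted centers of mass, line directions taken as the leading principal component of each fuzzy scatter matrix, and memberships recomputed from squared orthogonal distances to the lines. The function name `fuzzy_c_lines`, the weighting exponent default `m=2.0`, the random initialization, and the stopping rule are all assumptions made for illustration.

```python
import numpy as np

def fuzzy_c_lines(X, c, m=2.0, n_iter=100, tol=1e-6, seed=0):
    """Illustrative Fuzzy c-Lines sketch (hypothetical helper, not the paper's code).

    X: (n, p) data array; c: number of lines; m > 1: fuzziness exponent (assumed).
    Returns memberships U (c, n), centers V (c, p), unit directions dirs (c, p).
    """
    rng = np.random.default_rng(seed)
    n, p = X.shape
    # Random initial fuzzy partition: each column sums to 1.
    U = rng.random((c, n))
    U /= U.sum(axis=0, keepdims=True)

    for _ in range(n_iter):
        W = U ** m
        # Fuzzy-weighted centers of mass, one per cluster.
        V = (W @ X) / W.sum(axis=1, keepdims=True)
        D2 = np.empty((c, n))
        dirs = np.empty((c, p))
        for i in range(c):
            R = X - V[i]
            # Within-cluster fuzzy scatter matrix for cluster i.
            S = (W[i][:, None] * R).T @ R
            # eigh returns eigenvalues in ascending order; take the largest.
            _, eigvecs = np.linalg.eigh(S)
            d = eigvecs[:, -1]
            dirs[i] = d
            # Squared orthogonal distance from each point to line (V[i], d).
            D2[i] = np.sum(R ** 2, axis=1) - (R @ d) ** 2
        # Guard against zero or slightly negative distances from rounding.
        D2 = np.maximum(D2, 1e-12)
        inv = D2 ** (-1.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=0, keepdims=True)
        converged = np.abs(U_new - U).max() < tol
        U = U_new
        if converged:
            break
    return U, V, dirs
```

Each pass alternates between fitting lines to the current fuzzy partition and repartitioning the data against the fitted lines, which is the fixed-point (Picard) structure the abstract describes. Like other alternating schemes of this kind, the result can depend on the initial partition.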
