Replicating Cluster Analysis: Method, Consistency, and Validity

Abstract
To replicate a cluster analysis, clusters must first be described in terms of an objective classification rule. The effectiveness of three rules (nearest neighbor classification, nearest centroid assignment, and quadratic discriminant analysis) for replicating Ward's algorithm (Ward, 1963) is evaluated by Monte Carlo study. Consistent replication links clusters and their replicas identically over alternative cross-validation sequences (i.e., A replicates B, B replicates A) and is associated with recovery of known clusters. Replication using nearest neighbor classification results in superior goodness-of-fit, more frequent consistent replication, and significant prediction of recovery. Although moderate or greater replication denotes good recovery, replication is not a necessary condition of recovery of true clusters.
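
The idea of consistent replication can be illustrated with a minimal Python sketch. It is not the study's procedure: the abstract does not say how clusters are linked to their replicas or how goodness-of-fit is computed, so linking by largest overlap is an assumption here, and the function names (`nearest_neighbor_labels`, `link_clusters`, `is_consistent`) are hypothetical. The cluster labels are hand-written stand-ins for the output of Ward's algorithm run separately on two samples, A and B.

```python
from collections import Counter
from math import dist


def nearest_neighbor_labels(train_points, train_labels, test_points):
    """Give each test point the cluster label of its nearest training point."""
    assigned = []
    for point in test_points:
        nearest = min(range(len(train_points)),
                      key=lambda i: dist(point, train_points[i]))
        assigned.append(train_labels[nearest])
    return assigned


def link_clusters(replica_labels, own_labels):
    """Link each replicated cluster to the cluster it overlaps most.

    Linking by largest overlap is an assumption made for this sketch.
    """
    overlap = Counter(zip(replica_labels, own_labels))
    targets = sorted(set(own_labels))
    return {source: max(targets, key=lambda t: overlap[(source, t)])
            for source in sorted(set(replica_labels))}


def is_consistent(links_a_to_b, links_b_to_a):
    """True when both cross-validation sequences pair the clusters identically."""
    return (len(links_a_to_b) == len(links_b_to_a)
            and all(links_b_to_a.get(b) == a for a, b in links_a_to_b.items()))


# Two samples with hand-written labels standing in for Ward's output.
# The numbering differs between samples, as it would for independent runs.
sample_a = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (5.0, 6.0)]
labels_a = [1, 1, 2, 2]
sample_b = [(0.2, 0.1), (0.1, 0.9), (5.1, 5.2), (4.9, 6.1)]
labels_b = [2, 2, 1, 1]

# Sequence 1: the rule built on A classifies B, then is compared with B's clusters.
replica_of_a_in_b = nearest_neighbor_labels(sample_a, labels_a, sample_b)
a_to_b = link_clusters(replica_of_a_in_b, labels_b)

# Sequence 2: the rule built on B classifies A, then is compared with A's clusters.
replica_of_b_in_a = nearest_neighbor_labels(sample_b, labels_b, sample_a)
b_to_a = link_clusters(replica_of_b_in_a, labels_a)

consistent = is_consistent(a_to_b, b_to_a)
```

In practice the labels would come from a clustering routine rather than being typed in; one option is SciPy's `linkage(X, method="ward")` followed by `fcluster`. Nearest centroid assignment or quadratic discriminant analysis could replace `nearest_neighbor_labels` without changing the linking and consistency steps.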
