Phenetic Clustering in Biology: A Critique

Abstract
Phenetic clustering, the formation of hierarchical, nonoverlapping groups strictly according to degree of similarity, has serious shortcomings as it is commonly used in biology. When used as a method for estimating phylogeny, phenetic clustering rests on a questionable assumption of correspondence between similarity and recency of common ancestry. This compromises its ability to reconstruct the correct branching sequence when rates of evolutionary divergence are unequal among lineages, and causes it to obscure rate differences even when the branching sequence is reconstructed correctly. When used as a method for analysing patterns of geographic variation and genetic continuity among populations, phenetic clustering rests on a questionable assumption of correspondence between similarity and degree of genetic continuity. This compromises its ability to identify genetically continuous units when their component populations are differentiated and, combined with its sensitivity to uneven geographic sampling, can cause the method to yield misleading results if sampling patterns are not taken into consideration. Finally, even when used simply as a method for analysing patterns of similarity without regard to causal processes, phenetic clustering rests on a questionable assumption of nested hierarchical structure. This compromises its ability to represent similarity relationships accurately when those relationships exhibit a significant nonhierarchical component. For all of the common biological applications of phenetic clustering, there exist alternative analytical methods that do not suffer from the problems associated with phenetic clustering. The problems in question result not from the phenetic (similarity) data themselves, which often can be analysed in more appropriate ways, but from the clustering of those data. The problems with phenetic clustering, as well as the advantages of alternative methods, have been known for many years.
Advocacy of phenetic clustering at the expense of more appropriate methods can be explained as the result of constraints imposed by an implicit assumption of nested hierarchies that was part of the taxonomic context within which the methods were developed.
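
The effect of unequal rates on the reconstructed branching sequence can be illustrated with a small numerical sketch. The example below is not taken from the paper: the three taxa, the branch lengths, and the helper function `upgma` are all hypothetical, and average-linkage (UPGMA) clustering is used here as one common form of phenetic clustering. The assumed true tree groups A with B, but lineage B has diverged much faster than A or C (branch lengths: root to C = 1, root to the A+B ancestor = 1, ancestor to A = 1, ancestor to B = 5).

```python
from itertools import combinations


def upgma(names, dist):
    """Average-linkage (UPGMA) clustering; returns a nested-tuple tree.

    `dist` maps frozenset({x, y}) to the distance between leaves x and y.
    """
    # Each cluster is a pair: (subtree, list of leaf names it contains).
    clusters = [(n, [n]) for n in names]

    def avg(c1, c2):
        # Mean pairwise distance between the leaves of two clusters.
        pairs = [(a, b) for a in c1[1] for b in c2[1]]
        return sum(dist[frozenset(p)] for p in pairs) / len(pairs)

    while len(clusters) > 1:
        # Join the two clusters with the smallest average distance.
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda ij: avg(clusters[ij[0]], clusters[ij[1]]),
        )
        merged = (
            (clusters[i][0], clusters[j][0]),
            clusters[i][1] + clusters[j][1],
        )
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return clusters[0][0]


# Path-length distances implied by the hypothetical branch lengths above.
names = ["A", "B", "C"]
dist = {
    frozenset(("A", "B")): 6.0,  # 1 + 5
    frozenset(("A", "C")): 3.0,  # 1 + 1 + 1
    frozenset(("B", "C")): 7.0,  # 5 + 1 + 1
}

tree = upgma(names, dist)
print(tree)
```

Under these assumed distances, clustering joins A with C first because they are the most similar pair, even though A shares a more recent common ancestor with B. The resulting dendrogram also places B at a single joining level, which hides the fact that its dissimilarity reflects a faster rate of divergence rather than an earlier split.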