Abstract
Five methods of coding character differences for meristic or continuous variables in the presence of within-taxon variability are discussed: (1) scaling by among-group variability; (2) scaling by within-group variability; (3) simple gap-coding; (4) homogeneous-subset coding; and (5) generalized gap-coding. Scaling by among-group variability is ineffective because it does not use within-group variability for coding purposes. Scaling by within-group variability does not effectively alleviate the computational and theoretical problems associated with within-group character variability. The implications of gap-coding for character differentiation are examined, and the relationship between increasing sample size and decreasing information content of gap-coded characters is reviewed. Generalized gap-coding is proposed to eliminate the problems apparent with simple gap-coding. Homogeneous-subset coding is developed in more detail than in previous treatments and is compared to generalized gap-coding. The two techniques differ only in the critical values used to establish homogeneous or discriminant groups. These differences are most pronounced when sample sizes differ among taxa. For moderate to large sample sizes, homogeneous-subset coding will detect more differentiation than generalized gap-coding for most choices of critical gap size. Seven data sets, representing a range of numbers of taxa and characters as well as types of taxonomic units, were coded following three of the procedures (3, 4, and 5, above), and phylogenetic trees were constructed. The results are discussed with regard to the amount of useful information retained by the coding procedures, the number of minimum-length trees found, and the consistency of the characters on the constructed trees. Consistency levels for all the coding methods were low and inversely related to the number of taxa in the study. Finally, character-coding methods for phylogenetic analysis are discussed with regard to the implications of the Kluge-Kerfoot phenomenon.