The Metric Properties of DNA-DNA Hybridization Dissimilarity Measures

Abstract
Under a simple model of DNA sequence dissociation, DNA-DNA hybridization dissimilarity measures are expected to behave like the complement of Gower's coefficient of similarity, calculated for each pair of taxonomic units (j, k) across i homologues (nucleotide positions) with four possible states, multiplied by a constant relating base-pair mismatch to the temperature of DNA dissociation. DNA dissimilarity measures that include information only from homologues, therefore, are expected to conform to the axioms of a metric. Empirically, we have found that measures of dissimilarity between single-copy nuclear genomes regularly conform to three of the four metric axioms: identity, distinctness, and the triangle inequality. However, reciprocal values frequently fail to conform to the fourth axiom, symmetry. For some measures dependent upon the percent of reassociation (e.g., delta T50H), failure of symmetry can occur when sequences present in one genome have no homologue in other genomes (e.g., as a result of sequence deletion or lateral transfer). For all measures, including delta mode and delta Tm, the presence of sequences in one genome that lack a homologue in other genomes differentially affects the effective concentration of shared sequences in reciprocal reactions, causing reciprocal measurements to represent comparisons of different sets of sequences. This effect also potentially obtains under certain conditions of sequence duplication and divergence.
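
The expectation stated above can be illustrated with a minimal sketch, which is not the procedure used in this study. It assumes aligned sequences of equal length in which every position is a homologue with four possible states, so that the complement of Gower's coefficient reduces to the fraction of mismatched positions. The function names, the toy sequences, and the `scale` parameter (a placeholder for the constant relating mismatch to dissociation temperature, set arbitrarily to 1.0) are hypothetical.

```python
from itertools import permutations


def gower_dissimilarity(seq_j, seq_k, scale=1.0):
    """Complement of Gower's similarity for nominal characters.

    For aligned nucleotide positions this is the fraction of mismatched
    positions. `scale` is a hypothetical placeholder for the constant
    relating base-pair mismatch to dissociation temperature.
    """
    if len(seq_j) != len(seq_k):
        raise ValueError("sequences must be aligned to the same length")
    mismatches = sum(a != b for a, b in zip(seq_j, seq_k))
    return scale * mismatches / len(seq_j)


def check_metric_axioms(taxa, dist, tol=1e-12):
    """Test the four metric axioms over all taxa in a name -> sequence dict."""
    names = list(taxa)
    pairs = list(permutations(names, 2))
    triples = list(permutations(names, 3))
    return {
        # d(j, j) = 0
        "identity": all(dist(taxa[j], taxa[j]) == 0 for j in names),
        # d(j, k) > 0 whenever j and k differ
        "distinctness": all(
            dist(taxa[j], taxa[k]) > 0 for j, k in pairs if taxa[j] != taxa[k]
        ),
        # d(j, k) = d(k, j)
        "symmetry": all(
            abs(dist(taxa[j], taxa[k]) - dist(taxa[k], taxa[j])) <= tol
            for j, k in pairs
        ),
        # d(j, l) <= d(j, k) + d(k, l)
        "triangle_inequality": all(
            dist(taxa[j], taxa[l]) <= dist(taxa[j], taxa[k]) + dist(taxa[k], taxa[l]) + tol
            for j, k, l in triples
        ),
    }


# Toy aligned sequences (illustrative only, not real data).
taxa = {
    "A": "ACGTACGT",
    "B": "ACGTTCGT",
    "C": "TCGTTCGA",
}

axioms = check_metric_axioms(taxa, gower_dissimilarity)
```

Because this idealized measure compares only homologous positions, all four axioms hold by construction. The sketch does not model the asymmetry described above, which arises when sequences in one genome lack homologues in the other.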