Shifting ditypic site analysis: Heuristics for expanding the phylogenetic range of nucleotide sequences in Sankoff analyses

Abstract
We describe and illustrate a simple heuristic approach to the Sankoff methods for construction of parsimonious evolutionary trees from nucleotide sequence data. The procedure is intended to permit more valid inferences, particularly from relatively short sequences, concerning relationships among taxa separated for long time intervals. The procedure is based on the great variability of evolutionary plasticity among sites in the molecules and removes from consideration the more highly variable sites. Editing is accomplished after classifying sites in carefully aligned arrays of sequences. Only “ditypic sites,” i.e., sites observed in only two evolutionary states within the array, are used in making phylogenetic inferences. This strategy makes possible the construction of good approximations to the most parsimonious Steiner trees, by means of efficient programs that require “dense species arrays,” i.e., species sets whose members differ from each other by relatively small numbers of differences in conservative sites. The technique is illustrated with 5S and 5.8S rRNA sequence data from published catalogs.
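
As an illustration of the site-classification step only, the following minimal sketch selects ditypic columns from an aligned array and builds the edited array that would be passed on to tree construction. It is not the authors' program: the function names `ditypic_sites` and `edit_alignment`, the toy sequences, and the choice to skip any column containing a gap character are all assumptions made for this example.

```python
from typing import Dict, List


def ditypic_sites(alignment: Dict[str, str], gap_chars: str = "-") -> List[int]:
    """Return 0-based indices of columns observed in exactly two states.

    Assumption (not from the source): columns containing a gap character
    are skipped rather than counted as an additional state.
    """
    seqs = list(alignment.values())
    if not seqs:
        return []
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("sequences must be aligned to equal length")

    gaps = set(gap_chars)
    keep = []
    for col in range(length):
        states = {s[col].upper() for s in seqs}
        if states & gaps:
            continue  # assumed handling of gapped columns
        if len(states) == 2:
            keep.append(col)
    return keep


def edit_alignment(alignment: Dict[str, str], sites: List[int]) -> Dict[str, str]:
    """Keep only the listed columns in every sequence."""
    return {name: "".join(seq[i] for i in sites) for name, seq in alignment.items()}


# Hypothetical toy array of four aligned sequences (not real rRNA data).
alignment = {
    "sp1": "GACUAC",
    "sp2": "GACCAC",
    "sp3": "GGCGAU",
    "sp4": "GGCUAU",
}

sites = ditypic_sites(alignment)           # [1, 5]
edited = edit_alignment(alignment, sites)  # {'sp1': 'AC', 'sp2': 'AC', 'sp3': 'GU', 'sp4': 'GU'}
```

In this toy array, columns 0, 2, and 4 are invariant and column 3 shows three states, so only columns 1 and 5 survive editing. The sketch covers classification and editing alone; it does not attempt the Sankoff tree construction itself.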