A measure of the similarity of sets of sequences not requiring sequence alignment.
- 1 July 1986
- journal article
- research article
- Published in Proceedings of the National Academy of Sciences
- Vol. 83 (14) , 5155-5159
- https://doi.org/10.1073/pnas.83.14.5155
Abstract
Determination of first- and second-order Markov chain homogeneity of sets of nuclear eukaryotic DNA sequences, both coding and noncoding, finds similarities imperceptible to the standard Needleman-Wunsch base matching or dot-matrix algorithms. These measures of the similarities of the distributions of adjacent pairs or triplets are in agreement with accepted evolutionary-tree topologies. Hierarchical clustering of the distributions of doublets of 30 miscellaneous coding sequences gives clusters in reasonable agreement with accepted biological classifications. In addition to similarity by homology, there is also observed similarity of disparate genes in the same organism; for example, all three disparate yeast genes (two enzymes and actin) form a well-distinguished cluster.

This publication has 47 references indexed in Scilit:
- Choice of base at silent codon site 3 is not selectively neutral in eucaryotic structural genes: It maintains excess short runs of weak and strong hydrogen bonding bases. Journal of Molecular Evolution, 1983
- Human leukocyte interferon produced by E. coli is biologically active. Nature, 1980
- Complete nucleotide sequence of the human δ-globin gene. Cell, 1980
- DNA methylation and the frequency of CpG in animal DNA. Nucleic Acids Research, 1980
- Comparison of Total Sequence of a Cloned Rabbit β-Globin Gene and Its Flanking Regions with a Homologous Mouse Sequence. Science, 1979
- The structure and evolution of the two nonallelic rat preproinsulin genes. Cell, 1979
- Sequence of three introns in the chick ovalbumin gene. Nature, 1979
- The DNA sequence of sea urchin (S. purpuratus) H2A, H2B and H3 histone coding and spacer regions. Cell, 1978
- The appearance of new structures and functions in proteins during evolution. Journal of Molecular Evolution, 1975
- A general method applicable to the search for similarities in the amino acid sequence of two proteins. Journal of Molecular Biology, 1970
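
The alignment-free comparison described in the abstract rests on the distribution of adjacent base pairs (doublets) in each sequence. The sketch below illustrates that idea only; it is not the paper's homogeneity statistic. It assumes overlapping doublets over the alphabet A, C, G, T and uses a plain Euclidean distance between frequency vectors as a stand-in similarity measure. The function names `doublet_distribution` and `euclidean_distance` are hypothetical, chosen for this illustration.

```python
from collections import Counter
from itertools import product
import math

# Assumption: DNA alphabet only; doublets containing other symbols are ignored.
ALPHABET = "ACGT"
DOUBLETS = ["".join(pair) for pair in product(ALPHABET, repeat=2)]


def doublet_distribution(seq):
    """Return relative frequencies of the 16 overlapping adjacent base pairs."""
    seq = seq.upper()
    counts = Counter(seq[i:i + 2] for i in range(len(seq) - 1))
    total = sum(counts[d] for d in DOUBLETS)
    if total == 0:
        raise ValueError("sequence contains no valid doublets")
    return [counts[d] / total for d in DOUBLETS]


def euclidean_distance(p, q):
    """Illustrative distance between two doublet distributions (an assumption,
    not the Markov chain homogeneity measure used in the paper)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


if __name__ == "__main__":
    # Toy sequences invented for the example, not data from the paper.
    a = doublet_distribution("ACACACACGT")
    b = doublet_distribution("ACACACGTGT")
    c = doublet_distribution("GGGGCCCCGG")
    print(euclidean_distance(a, b), euclidean_distance(a, c))
```

No alignment step is involved: each sequence is reduced to a 16-element vector, so sequences of different lengths can be compared directly. A matrix of such pairwise distances could then be passed to any standard hierarchical clustering routine, in the spirit of the clustering of 30 coding sequences mentioned in the abstract.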