Phylogenetic continuum indicates “Galaxies” in the protein universe: Preliminary results on the natural group structures of proteins

1 April 1992

journal article
research article
Published by Springer Nature in Journal of Molecular Evolution

Vol. 34 (4), 358-375
https://doi.org/10.1007/bf00160244

Abstract

The markedly nonuniform, even systematic, distribution of sequences in the protein “universe” was analyzed by methods of protein taxonomy. Mapping the natural hierarchical system of proteins revealed several dense cores, i.e., well-defined clusters of proteins that appear to be natural structural groupings and may serve as seeds for a future protein taxonomy. The aim was not to force proteins into more or less man-made categories by discriminant analysis, but to find structurally similar groups, possibly of common evolutionary origin. Single-valued distance measures between pairs of superfamilies from the Protein Identification Resource were defined by two χ²-like methods based on tripeptide frequencies and by the variable-length subsequence identity method derived from dot-matrix comparisons. The distance matrices were processed by several methods of cluster analysis to detect a phylogenetic continuum between highly divergent proteins. Only well-defined clusters characterized by relatively unique structural, intracellular environmental, organismal, and functional attribute states were selected as major protein groups. These include subsets of viral and Escherichia coli proteins; hormones; inhibitors; plant, ribosomal, serum, and structural proteins; amino acid synthases; and clusters dominated by certain oxidoreductases and by apolar and DNA-associated enzymes. The limited repertoire of functional patterns due to small genome size, the high rate of recombination, specific features of the bacterial membranes, or of the virus cycle canalize certain proteins of viruses and Gram-negative bacteria, respectively, into organismal groups.
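
The pipeline outlined in the abstract (tripeptide frequency profiles, a χ²-like distance, then cluster analysis of the distance matrix) can be illustrated with a minimal sketch. The abstract does not give the exact form of the two χ²-like measures or name the clustering methods, so the symmetric χ² distance and the threshold-based single-linkage grouping below are assumptions chosen for illustration, not the authors' method. All function names and the toy sequences are hypothetical.

```python
from collections import Counter
from itertools import combinations


def tripeptide_freqs(seq):
    """Relative frequencies of overlapping tripeptides in one sequence."""
    if len(seq) < 3:
        raise ValueError("sequence must contain at least three residues")
    counts = Counter(seq[i:i + 3] for i in range(len(seq) - 2))
    total = sum(counts.values())
    return {tri: n / total for tri, n in counts.items()}


def chi2_like_distance(p, q):
    """Symmetric chi-square-style distance between two frequency profiles.

    Assumed form: sum over tripeptides of (p - q)^2 / (p + q).
    It is 0 for identical profiles and 2 for profiles with no shared tripeptide.
    """
    total = 0.0
    for tri in set(p) | set(q):
        a, b = p.get(tri, 0.0), q.get(tri, 0.0)
        total += (a - b) ** 2 / (a + b)
    return total


def distance_matrix(seqs):
    """Pairwise distances, keyed by the unordered pair of sequence names."""
    profiles = {name: tripeptide_freqs(s) for name, s in seqs.items()}
    return {
        frozenset((a, b)): chi2_like_distance(profiles[a], profiles[b])
        for a, b in combinations(profiles, 2)
    }


def single_linkage(names, dist, threshold):
    """Merge clusters whose closest members lie under the threshold."""
    clusters = [{n} for n in names]
    merged = True
    while merged:
        merged = False
        for i, j in combinations(range(len(clusters)), 2):
            closest = min(
                dist[frozenset((a, b))]
                for a in clusters[i]
                for b in clusters[j]
            )
            if closest < threshold:
                clusters[i] |= clusters[j]
                del clusters[j]
                merged = True
                break  # cluster list changed, restart the scan
    return clusters


if __name__ == "__main__":
    # Toy sequences, invented for illustration only.
    toy = {"a": "ACDEFGHIK", "b": "ACDEFGHIK", "c": "WWWWYYYY"}
    d = distance_matrix(toy)
    print(single_linkage(list(toy), d, threshold=1.0))
```

With real data, the inputs would be representative sequences (or pooled tripeptide counts) per superfamily, and the threshold would need to be chosen empirically; the study itself compared several clustering methods rather than relying on one.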
