Phylogenetic continuum indicates “Galaxies” in the protein universe: Preliminary results on the natural group structures of proteins
- 1 April 1992
- journal article
- research article
- Published by Springer Nature in Journal of Molecular Evolution
- Vol. 34 (4) , 358-375
- https://doi.org/10.1007/bf00160244
Abstract
The markedly nonuniform, even systematic distribution of sequences in the protein “universe” has been analyzed by methods of protein taxonomy. Mapping of the natural hierarchical system of proteins has revealed some dense cores, i.e., well-defined clusterings of proteins that seem to be natural structural groupings, possibly seeds for a future protein taxonomy. The aim was not to force proteins into more or less man-made categories by discriminant analysis, but to find structurally similar groups, possibly of common evolutionary origin. Single-valued distance measures between pairs of superfamilies from the Protein Identification Resource were defined by two χ²-like methods on tripeptide frequencies and the variable-length subsequence identity method derived from dot-matrix comparisons. Distance matrices were processed by several methods of cluster analysis to detect phylogenetic continuum between highly divergent proteins. Only well-defined clusters characterized by relatively unique structural, intracellular environmental, organismal, and functional attribute states were selected as major protein groups, including subsets of viral and Escherichia coli proteins, hormones, inhibitors, plant, ribosomal, serum and structural proteins, amino acid synthases, and clusters dominated by certain oxidoreductases and apolar and DNA-associated enzymes. The limited repertoire of functional patterns due to small genome size, the high rate of recombination, specific features of the bacterial membranes, or of the virus cycle canalize certain proteins of viruses and Gram-negative bacteria, respectively, to organismal groups.
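The general idea of a χ²-like distance on tripeptide frequencies followed by cluster analysis can be illustrated with a minimal sketch. The abstract does not give the exact formulas, so everything here is an assumption made for illustration: the symmetric form Σ (p − q)² / (p + q), the single-linkage grouping with a fixed threshold, the function names, and the made-up toy sequences. It is not the authors' implementation.

```python
from collections import Counter
from itertools import combinations


def tripeptide_freqs(seq):
    """Relative frequencies of overlapping tripeptides in one sequence."""
    counts = Counter(seq[i:i + 3] for i in range(len(seq) - 2))
    total = sum(counts.values())
    return {tri: n / total for tri, n in counts.items()}


def chi2_like_distance(p, q):
    """Assumed chi-square-like form: sum of (p - q)^2 / (p + q).

    Ranges from 0 (identical profiles) to 2 (no shared tripeptides).
    """
    total = 0.0
    for tri in set(p) | set(q):
        a, b = p.get(tri, 0.0), q.get(tri, 0.0)
        total += (a - b) ** 2 / (a + b)
    return total


def pairwise_distances(seqs):
    """Distance for every unordered pair of names, keyed by frozenset."""
    freqs = {name: tripeptide_freqs(s) for name, s in seqs.items()}
    return {
        frozenset((x, y)): chi2_like_distance(freqs[x], freqs[y])
        for x, y in combinations(seqs, 2)
    }


def single_linkage(names, dist, threshold):
    """Merge clusters while any cross-cluster pair is within the threshold."""
    clusters = [{n} for n in names]
    merged = True
    while merged:
        merged = False
        for i, j in combinations(range(len(clusters)), 2):
            close = any(
                dist[frozenset((x, y))] <= threshold
                for x in clusters[i]
                for y in clusters[j]
            )
            if close:
                clusters[i] |= clusters[j]
                del clusters[j]
                merged = True
                break  # restart the scan, since indices have shifted
    return clusters


# Hypothetical toy sequences, not real proteins or data from the paper.
toy = {
    "a": "MKVLAAGIVMKVLAAGIV",
    "b": "MKVLAAGIVGKVLAAGIV",
    "c": "PPGPPGPPGPPGPPG",
}
d = pairwise_distances(toy)
clusters = single_linkage(list(toy), d, threshold=1.0)
print(clusters)
```

In this toy case the two near-identical sequences fall into one cluster and the repetitive third sequence stays apart. The threshold of 1.0 is arbitrary; the paper applied several clustering methods to full distance matrices of superfamilies, and those methods are not reproduced here.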