Classification of Proteins into Groups Based on Amino Acid Composition and Other Characters. II. Grouping into Four Types
- 1 July 1983
- journal article
- research article
- Published by Oxford University Press (OUP) in The Journal of Biochemistry
- Vol. 94 (3), 997-1007
- https://doi.org/10.1093/oxfordjournals.jbchem.a134443
Abstract
Correlations of the amino acid composition of a protein with its location in an organism, biological function, folding type, and disulfide bond(s) were examined for 356 proteins. In the present data set, 325 proteins of known location and biological characters were divided into 122 intracellular enzymes (BI), 73 intracellular nonenzymes (BII), 45 extracellular enzymes (BIII), and 85 extracellular nonenzymes (BIV). The compositions of these proteins were expressed as points in a composition space of 18 orthogonal axes, each representing the content of an amino acid. The distributions of points of BI and BIII were narrow and approximately spherical, but those of BII and BIV were rather wide. The groups are separated from each other in the space. We divided the space into four regions (A1 to A4) corresponding to the groups BI to BIV. A protein could be assigned to one of the four groups (A1 to A4) from its amino acid composition. The proteins correctly assigned amounted to 177 out of 195 intracellular proteins, and 94 out of 130 extracellular proteins. The correspondence was about 80% for classification into intracellular and extracellular proteins and 66% for that into the four groups. The folding type also had a significant correlation with the above groups, i.e., intracellular enzymes are rich in α/β, intracellular nonenzymes in α, extracellular enzymes in β and α+β, and extracellular nonenzymes in β. The differences in average composition between intra- and extracellular proteins, and between enzymes and nonenzymes, were related to the structural characters, i.e., intracellular proteins contain more amino acids favoring α-helix than extracellular ones, and enzymes contain more hydrophobic amino acids than nonenzymes. The statistics on 213 Cys-containing proteins showed that disulfide bond(s) are found mostly (90%) in the extracellular proteins.
The results indicate that amino acid composition is well correlated with location in an organism, biological function, folding type, and disulfide bonding. The implications of the new findings are discussed from the protein-taxonomical point of view, and the validity of the present method is assessed.
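The counts quoted in the abstract can be checked with a few lines of arithmetic. The sketch below is only a consistency check on the reported figures, not the authors' classification procedure; the variable and function names are illustrative assumptions. Pooling the intracellular and extracellular assignments gives roughly 83% correct, which agrees with the "about 80%" stated for the intracellular/extracellular classification.

```python
# Group sizes as reported in the abstract (325 proteins of known location).
group_sizes = {"BI": 122, "BII": 73, "BIII": 45, "BIV": 85}

intracellular_total = group_sizes["BI"] + group_sizes["BII"]    # 195
extracellular_total = group_sizes["BIII"] + group_sizes["BIV"]  # 130
total = intracellular_total + extracellular_total               # 325

# Correctly assigned proteins, also from the abstract.
correct_intracellular = 177
correct_extracellular = 94


def percent(correct: int, n: int) -> float:
    """Return the share of correct assignments as a percentage."""
    return 100.0 * correct / n


# Per-location and pooled agreement (hypothetical names, for illustration).
intracellular_accuracy = percent(correct_intracellular, intracellular_total)
extracellular_accuracy = percent(correct_extracellular, extracellular_total)
location_accuracy = percent(correct_intracellular + correct_extracellular, total)

print(f"intracellular: {intracellular_accuracy:.1f}%")
print(f"extracellular: {extracellular_accuracy:.1f}%")
print(f"pooled:        {location_accuracy:.1f}%")
```

The per-location figures differ noticeably (intracellular proteins are assigned correctly more often than extracellular ones), which the single pooled percentage does not show.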