A multivariate study of the relationship between the genetic code and the physical-chemical properties of amino acids

Abstract
The 20 naturally occurring amino acids are characterized by 20 variables: pK_NH2, pK_COOH, pI, molecular weight, substituent van der Waals volume, seven ¹H and ¹³C nuclear magnetic resonance shift variables, and eight hydrophobicity-hydrophilicity scales. The 20-dimensional data set is reduced to a few new dimensions by principal components analysis. The first three principal components reveal relationships between the properties of the amino acids and the genetic code. Specifically, the amino acids coded for by adenine (A), uracil (U), or cytosine (C) in the second codon position (corresponding to U, A, or G in the second anticodon position) form groups in these components. No grouping was detected for the amino acids coded for by guanine (G) in the second codon position (corresponding to C in the second anticodon position). The results show that a relationship exists between the physical-chemical properties of the amino acids and which of the nucleotides A (U), U (A), or C (G) is used in the second codon (anticodon) position. The amino acids coded for by G (C) in the second codon (anticodon) position do not participate in this relationship.
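The analysis described above can be illustrated with a minimal sketch, which is not the authors' implementation. It assumes a 20 x 20 matrix (amino acids by variables), autoscales each variable to unit variance (an assumption; the abstract does not state the scaling used), and computes principal component scores by singular value decomposition. The data matrix here is a random placeholder, not the measured properties, so the output shows only the mechanics. The names `pca_scores` and `SECOND_BASE` are hypothetical; the grouping by second codon base follows the standard genetic code.

```python
import numpy as np

# Amino acids grouped by the base in the second codon position
# (standard genetic code). Ser is listed under C (UCN) although it is
# also coded by AGU/AGC, which have G in the second position.
SECOND_BASE = {
    "U": ["Phe", "Leu", "Ile", "Met", "Val"],
    "C": ["Ser", "Pro", "Thr", "Ala"],
    "A": ["Tyr", "His", "Gln", "Asn", "Lys", "Asp", "Glu"],
    "G": ["Cys", "Trp", "Arg", "Gly"],
}
AMINO_ACIDS = [aa for group in SECOND_BASE.values() for aa in group]


def pca_scores(X, n_components=3):
    """Autoscale the columns of X and return the scores on the leading
    principal components plus the fraction of variance each explains."""
    X = np.asarray(X, dtype=float)
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)  # assumed scaling
    U, s, _ = np.linalg.svd(Z, full_matrices=False)
    scores = U[:, :n_components] * s[:n_components]
    explained = (s ** 2) / np.sum(s ** 2)
    return scores, explained[:n_components]


# Placeholder data: rows follow AMINO_ACIDS, columns stand in for the
# 20 physical-chemical variables. Replace with real measurements.
rng = np.random.default_rng(0)
X = rng.normal(size=(len(AMINO_ACIDS), 20))

scores, explained = pca_scores(X, n_components=3)

# Mean position of each second-base group in the first three components.
for base, members in SECOND_BASE.items():
    rows = [AMINO_ACIDS.index(aa) for aa in members]
    print(base, np.round(scores[rows].mean(axis=0), 2))
```

With real property data, well-separated group means for U, C, and A alongside a scattered G group would correspond to the pattern the abstract reports.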