Regularities in the primary structure of proteins

Abstract
In this paper the latest protein database, consisting of more than a million amino acids, is analyzed to characterize the short-range regularities in the primary structure. The amino acid distributions along the polypeptide chain and among the proteins were studied first, and their influence on the amino acid pair statistics was taken into account. We are primarily interested in the distances in the covalent structure at which the amino acid pair frequencies show non-random character. The amino acid pairs separated by at least 20 residues in the covalent structure exhibit an exact Gaussian distribution. We found that there is a range of non-random pairing in the covalent structure. We conclude that the pair preference characters are different for each of the 20 × 20 amino acid pairs. The range of the non-random pairing varies from pair to pair, and in most cases it does not extend beyond the 9th neighbour. The preference of a certain pair in a certain position cannot be derived from the character of that pair in another position. The preference values of the 400 amino acid pairs are listed for positions up to the 9th neighbour. Some fields of potential application of these data are also discussed.
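
As an illustration of what a pair preference at a given separation can mean, the following is a minimal sketch in Python. It is not the authors' procedure: it assumes a simple observed/expected ratio, where the expected count comes from the marginal residue frequencies at the two positions, and it does not apply the corrections for amino acid distribution along the chain and among proteins that the abstract mentions. The function name `pair_preferences` and its parameters are hypothetical.

```python
from collections import Counter


def pair_preferences(sequences, k):
    """Observed/expected ratio for residue pairs separated by k positions.

    Hypothetical helper: assumes expected counts follow from independent
    marginal frequencies; this is an illustrative choice, not the paper's.
    """
    pair_counts = Counter()    # counts of (residue at i, residue at i + k)
    first_counts = Counter()   # counts of residues seen in the first position
    second_counts = Counter()  # counts of residues seen in the second position
    total = 0

    for seq in sequences:
        for i in range(len(seq) - k):
            a, b = seq[i], seq[i + k]
            pair_counts[(a, b)] += 1
            first_counts[a] += 1
            second_counts[b] += 1
            total += 1

    prefs = {}
    for (a, b), observed in pair_counts.items():
        # Expected count if the two positions were independent.
        expected = first_counts[a] * second_counts[b] / total
        prefs[(a, b)] = observed / expected
    return prefs
```

Under these assumptions a ratio near 1 indicates random pairing at that separation, while values well above or below 1 indicate a preferred or avoided pair. Calling the function for k = 1 through 9 on a set of one-letter-code sequences would give one 20 × 20 table per neighbour position.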

This publication has 8 references indexed in Scilit: