Abstract
Existing methods of identifying the cleavage site of the nascent polypeptide, and hence the C-terminal residue to which the glycosylphosphatidylinositol (GPI) anchor is attached in mature GPI-anchored proteins, are technically difficult and labour-intensive. We tested the hypothesis that this locus can be predicted from the cDNA-deduced amino acid sequence and the amino acid composition of GPI-anchored proteins. We employed a statistical approach based on repeated χ² comparisons between two sets of amino acid proportions: those of the residual cDNA-deduced polypeptide (minus the N-terminal signal peptide) after computer-generated progressive exoproteolysis from its C-terminus, one amino acid at a time, and the fixed proportions obtained from amino acid analysis of the mature GPI-anchored protein. The initial comparison invariably gave a relatively high χ² statistic, which fell progressively to a minimum at which the amino acid proportions of the progressively exoproteolysed polypeptide and of the endoproteolysed polypeptide of the mature GPI-anchored protein were in closest agreement. This objectively defined and unique minimum accurately identified the locus of post-translational endoproteolytic cleavage of the nascent polypeptide in several tissue-specific single-gene-encoded GPI-anchored proteins. Thus the C-terminal amino acid to which the GPI anchor is attached can be rapidly identified using data from the cDNA sequence and the amino acid composition of proteins suspected to be GPI-anchored.
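The truncation-and-comparison procedure described above can be illustrated with a minimal Python sketch. It is not the authors' program. The function names `chi2_profile` and `predict_cleavage`, the toy sequence and the exact form of the statistic are all assumptions made for illustration. The abstract does not specify how the χ² statistic was computed; the sketch uses the two-sample form Σ(c − n·p)² / (c + n·p), where c is the residue count in the truncated polypeptide of length n and p is the measured proportion of that residue in the mature protein.

```python
from collections import Counter


def chi2_profile(sequence, mature_proportions, min_length=1):
    """Return [(length, chi2), ...] for successive C-terminal truncations.

    sequence           -- cDNA-deduced polypeptide with the N-terminal signal
                          peptide already removed (one-letter codes)
    mature_proportions -- dict mapping residue -> proportion found by amino
                          acid analysis of the mature protein
    The chi2 form used here is an assumption, not taken from the abstract.
    """
    alphabet = sorted(set(sequence) | set(mature_proportions))
    counts = Counter(sequence)
    profile = []
    for length in range(len(sequence), min_length - 1, -1):
        chi2 = 0.0
        for aa in alphabet:
            observed = counts[aa]  # count in the truncated polypeptide
            expected = length * mature_proportions.get(aa, 0.0)
            if observed + expected > 0:
                chi2 += (observed - expected) ** 2 / (observed + expected)
        profile.append((length, chi2))
        # Simulated exoproteolysis: remove one residue from the C-terminus.
        counts[sequence[length - 1]] -= 1
    return profile


def predict_cleavage(sequence, mature_proportions, min_length=1):
    """Return (length, residue) at the minimum of the chi2 profile."""
    profile = chi2_profile(sequence, mature_proportions, min_length)
    length, _ = min(profile, key=lambda item: item[1])
    return length, sequence[length - 1]


# Invented toy data (not a real protein): a 22-residue "mature" segment
# followed by a hydrophobic C-terminal tail that is removed on anchoring.
toy_mature = "MKTAYIAKQRQISFVKSHFSRQ"
toy_precursor = toy_mature + "LLLLLAAALLLVVVLLL"
toy_proportions = {
    aa: n / len(toy_mature) for aa, n in Counter(toy_mature).items()
}

print(predict_cleavage(toy_precursor, toy_proportions))  # prints (22, 'Q')
```

In this toy case the composition is taken exactly from the first 22 residues, so the statistic is essentially zero at that length and the minimum is unambiguous. With measured compositions the minimum would be greater than zero, and its sharpness would depend on the precision of the amino acid analysis.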