Using amphiphilic pseudo amino acid composition to predict enzyme subfamily classes
Top Cited Papers
Open Access
- 12 August 2004
- journal article
- research article
- Published by Oxford University Press (OUP) in Bioinformatics
- Vol. 21 (1) , 10-19
- https://doi.org/10.1093/bioinformatics/bth466
Abstract
Motivation: With protein sequences entering into databanks at an explosive pace, the early determination of the family or subfamily class for a newly found enzyme molecule becomes important because this is directly related to the detailed information about which specific target it acts on, as well as to its catalytic process and biological function. Unfortunately, it is both time-consuming and costly to do so by experiments alone. In a previous study, the covariant-discriminant algorithm was introduced to identify the 16 subfamily classes of oxidoreductases. Although the results were quite encouraging, the entire prediction process was based on the amino acid composition alone without including any sequence-order information. Therefore, it is worthy of further investigation. Results: To incorporate the sequence-order effects into the predictor, the ‘amphiphilic pseudo amino acid composition’ is introduced to represent the statistical sample of a protein. The novel representation contains 20 + 2λ discrete numbers: the first 20 numbers are the components of the conventional amino acid composition; the next 2λ numbers are a set of correlation factors that reflect different hydrophobicity and hydrophilicity distribution patterns along a protein chain. Based on such a concept and formulation scheme, a new predictor is developed. It is shown by the self-consistency test, jackknife test and independent dataset tests that the success rates obtained by the new predictor are all significantly higher than those by the previous predictors. The significant enhancement in success rates also implies that the distribution of hydrophobicity and hydrophilicity of the amino acid residues along a protein chain plays a very important role to its structure and function. Contact:kchou@san.rr.comKeywords
This publication has 32 references indexed in Scilit:
- Predicting enzyme family class in a hybridization spaceProtein Science, 2004
- Prediction of protein cellular attributes using pseudo‐amino acid compositionProteins-Structure Function and Bioinformatics, 2001
- A Key Driving Force in Determination of Protein Structural ClassesBiochemical and Biophysical Research Communications, 1999
- Relation between amino acid composition and cellular location of proteinsJournal of Molecular Biology, 1997
- Prediction of Protein Structural ClassesCritical Reviews in Biochemistry and Molecular Biology, 1995
- Discrimination of Intracellular and Extracellular Proteins Using Amino Acid Composition and Residue-pair FrequenciesJournal of Molecular Biology, 1994
- A Joint Prediction of the Folding Types of 1490 Human Proteins from their Genetic CodonsJournal of Theoretical Biology, 1993
- Improvements in protein secondary structure prediction by an enhanced neural networkJournal of Molecular Biology, 1990
- Prediction of protein structural class by discriminant analysisBiochimica et Biophysica Acta (BBA) - Protein Structure and Molecular Enzymology, 1986
- Prediction of protein antigenic determinants from amino acid sequences.Proceedings of the National Academy of Sciences, 1981