Prediction of protein homo-oligomer types by pseudo amino acid composition: Approached with an improved feature extraction and Naive Bayes Feature Fusion
- 15 May 2006
- journal article
- research article
- Published by Springer Nature in Amino Acids
- Vol. 30 (4) , 461-468
- https://doi.org/10.1007/s00726-006-0263-8
Abstract
Oligomers form through the interaction of non-covalently bound monomeric protein subunits. Oligomeric proteins are superior to monomers within the scope of functional evolution of biomacromolecules. Such complexes play an important role in various biological processes. It is highly desirable to predict oligomer types automatically from sequence. Here, based on the concept of pseudo amino acid composition, we propose an improved feature extraction method, which uses a weighted auto-correlation function of amino acid residue indices, together with a Naive Bayes multi-feature fusion algorithm, and apply both to the prediction of protein homo-oligomer types. Support vector machines (SVMs) were used as base classifiers in order to obtain better results. For example, the total accuracies of the A, B, C, D and E sets based on the improved feature extraction method are 77.63, 77.16, 76.46, 76.70 and 75.06%, respectively, in the jackknife test, which are 6.39, 5.92, 5.22, 5.46 and 3.82 percentage points higher than that of the G set based on the conventional amino acid composition method with the same SVM. Compared with Chou’s feature extraction method incorporating the quasi-sequence-order effect, our method increases the total accuracy by 1.01 to 3.51 percentage points. The total accuracy improves from 79.66 to 80.83% with the Naive Bayes Feature Fusion algorithm. These results show that: 1) the improved feature extraction method is effective and feasible, and the feature vectors based on this method may contain more protein quaternary structure information and appear to capture essential information about the composition and hydrophobicity of residues in the surface patches that are buried in the interfaces of associated subunits; 2) the Naive Bayes Feature Fusion algorithm combined with SVM can serve as a powerful computational tool for predicting protein homo-oligomer types.
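The feature extraction idea in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact formula, so everything specific here is an assumption: the function name `pseudo_aac`, the use of the Kyte-Doolittle hydropathy scale as the residue index, the maximum lag of 5, and the weight of 0.1 are illustrative choices, not the authors' settings. The sketch appends weighted auto-correlation terms of a residue index to the conventional 20-component amino acid composition.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Kyte-Doolittle hydropathy values, used only as an example residue index
# (assumption: the abstract does not say which indices were used).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def standardize(index):
    """Rescale a residue index to zero mean and unit variance over the 20 residues."""
    values = [index[a] for a in AMINO_ACIDS]
    mean = sum(values) / len(values)
    std = (sum((v - mean) ** 2 for v in values) / len(values)) ** 0.5
    return {a: (index[a] - mean) / std for a in AMINO_ACIDS}


def pseudo_aac(sequence, index=KYTE_DOOLITTLE, max_lag=5, weight=0.1):
    """Return 20 composition frequencies followed by max_lag weighted
    auto-correlation terms (hypothetical parameter choices)."""
    # Keep only the 20 standard residues.
    seq = [r for r in sequence.upper() if r in index]
    n = len(seq)
    if n <= max_lag:
        raise ValueError("sequence must be longer than max_lag")

    h = standardize(index)

    # Conventional amino acid composition: frequency of each residue type.
    counts = Counter(seq)
    composition = [counts[a] / n for a in AMINO_ACIDS]

    # Auto-correlation of the residue index at lags 1..max_lag.
    autocorr = [
        sum(h[seq[i]] * h[seq[i + k]] for i in range(n - k)) / (n - k)
        for k in range(1, max_lag + 1)
    ]

    # The weight balances sequence-order terms against composition terms.
    return composition + [weight * r for r in autocorr]


features = pseudo_aac("MKTAYIAKQRQISFVKSHFSRQ")
print(len(features))  # 20 composition terms + 5 auto-correlation terms
```

A vector built this way could be passed to any SVM implementation as one input row per protein. Per-index vectors from several residue indices would give the multiple feature sets that a fusion step combines.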