Prediction of protein subcellular localization by support vector machines using multi-scale energy and pseudo amino acid composition
- 19 January 2007
- journal article
- research article
- Published by Springer Nature in Amino Acids
- Vol. 33 (1) , 69-74
- https://doi.org/10.1007/s00726-006-0475-y
Abstract
As more and more genomes have been discovered in recent years, there is an urgent need to develop a reliable method to predict the subcellular localization for the explosion of newly found proteins. However, many well-known prediction methods based on amino acid composition have problems utilizing the sequence-order information. Here, based on the concept of Chou’s pseudo amino acid composition (PseAA), a new feature extraction method, the multi-scale energy (MSE) approach, is introduced to incorporate the sequence-order information. First, a protein sequence was mapped to a digital signal using the amino acid index. Then, by wavelet transform, the mapped signal was broken down into several scales in which the energy factors were calculated and further formed into an MSE feature vector. Following this, combining this MSE feature vector with amino acid composition (AA), we constructed a series of MSEPseAA feature vectors to represent the protein subcellular localization sequences. Finally, according to a new kind of normalization approach, the MSEPseAA feature vectors were normalized to form the improved MSEPseAA vectors, named IEPseAA. Using the technique of IEPseAA, C-support vector machine (C-SVM) and three multi-class SVM strategies, quite promising results were obtained, indicating that MSE is quite effective in reflecting the sequence-order effects and might become a useful tool for predicting the other attributes of proteins as well.
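The feature-extraction steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract names neither the amino acid index nor the wavelet, so the Kyte-Doolittle hydropathy scale and the Haar wavelet are assumptions made here, as is the definition of the energy factor as the mean squared coefficient at each scale. The function names (`haar_step`, `multi_scale_energy`, `mse_pseaa`) are hypothetical, and the paper's normalization step that yields IEPseAA is omitted because the abstract does not specify it.

```python
import math

# Assumption: Kyte-Doolittle hydropathy values stand in for "the amino acid index".
HYDROPATHY = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def haar_step(signal):
    """One level of the orthonormal Haar transform: (approximation, detail)."""
    if len(signal) % 2:
        # Pad odd-length input by repeating the last sample.
        signal = signal + [signal[-1]]
    root2 = math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) / root2 for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / root2 for i in range(0, len(signal), 2)]
    return approx, detail


def multi_scale_energy(sequence, levels=3):
    """Map a non-empty sequence to a signal, decompose it, return one energy per scale.

    Assumption: energy is the mean squared coefficient; the paper may define it
    differently. Residues outside the 20 standard letters raise KeyError.
    """
    approx = [HYDROPATHY[aa] for aa in sequence]
    energies = []
    for _ in range(levels):
        approx, detail = haar_step(approx)
        energies.append(sum(d * d for d in detail) / len(detail))
    # The final approximation contributes one more energy term.
    energies.append(sum(a * a for a in approx) / len(approx))
    return energies


def amino_acid_composition(sequence):
    """Fraction of each of the 20 standard amino acids in the sequence."""
    return [sequence.count(aa) / len(sequence) for aa in AMINO_ACIDS]


def mse_pseaa(sequence, levels=3):
    """Concatenate the 20 composition terms with the levels + 1 energy terms."""
    return amino_acid_composition(sequence) + multi_scale_energy(sequence, levels)
```

With `levels=3`, `mse_pseaa` returns a 24-component vector (20 composition terms plus 4 energy terms), which could then be passed to any SVM implementation. The number of decomposition levels is a free parameter in this sketch.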
This publication has 35 references indexed in Scilit:
- Hum-PLoc: A novel ensemble classifier for predicting human protein subcellular localization. Biochemical and Biophysical Research Communications, 2006
- Fuzzy KNN for predicting membrane protein types from pseudo-amino acid composition. Journal of Theoretical Biology, 2006
- Prediction of protein structural classes using support vector machines. Amino Acids, 2006
- Predicting protein subnuclear location with optimized evidence-theoretic K-nearest classifier and pseudo amino acid composition. Biochemical and Biophysical Research Communications, 2005
- An application of gene comparative image for predicting the effect on replication ratio by HBV virus gene missense mutation. Journal of Theoretical Biology, 2005
- Using optimized evidence-theoretic K-nearest neighbor classifier and pseudo-amino acid composition to predict membrane protein types. Biochemical and Biophysical Research Communications, 2005
- Using supervised fuzzy clustering to predict protein structural classes. Biochemical and Biophysical Research Communications, 2005
- A comparison of methods for multiclass support vector machines. IEEE Transactions on Neural Networks, 2002
- Prediction of protein cellular attributes using pseudo‐amino acid composition. Proteins-Structure Function and Bioinformatics, 2001
- Discrimination of Intracellular and Extracellular Proteins Using Amino Acid Composition and Residue-pair Frequencies. Journal of Molecular Biology, 1994