Prediction of protein accessible surface areas by support vector regression
- 29 July 2004
- journal article
- research article
- Published by Wiley in Proteins-Structure Function and Bioinformatics
- Vol. 57 (3), 558-564
- https://doi.org/10.1002/prot.20234
Abstract
A novel support vector regression (SVR) approach is proposed to predict protein accessible surface areas (ASAs) from their primary structures. In this work, we predict the real values of ASA in squared angstroms for residues instead of relative solvent accessibility. Based on protein residues, the mean and median absolute errors are 26.0 Å² and 18.87 Å², respectively. The correlation coefficient between the predicted and observed ASAs is 0.66. Cysteine is the best predicted amino acid (mean absolute error is 13.8 Å² and median absolute error is 8.37 Å²), while arginine is the least accurately predicted amino acid (mean absolute error is 42.7 Å² and median absolute error is 36.31 Å²). Our work suggests that the SVR approach can be directly applied to the ASA prediction where data preclassification has been used. Proteins 2004.
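The abstract reports three evaluation measures: the mean absolute error, the median absolute error, and the correlation coefficient between predicted and observed ASAs. The following is a minimal sketch of how such measures could be computed for per-residue values in Å². It is not the authors' code: the function name `asa_errors` and the sample numbers are hypothetical, and the correlation is assumed to be the Pearson coefficient, which the abstract does not state explicitly.

```python
from math import sqrt
from statistics import mean, median


def asa_errors(predicted, observed):
    """Return (mean absolute error, median absolute error, Pearson r).

    Both arguments are equal-length sequences of per-residue ASA values
    in squared angstroms. Hypothetical helper for illustration only.
    """
    if len(predicted) != len(observed) or len(predicted) < 2:
        raise ValueError("need two equal-length sequences with at least two values")

    # Absolute error for each residue.
    abs_errors = [abs(p - o) for p, o in zip(predicted, observed)]

    # Pearson correlation, written out so that only the standard library is needed.
    mean_p, mean_o = mean(predicted), mean(observed)
    cov = sum((p - mean_p) * (o - mean_o) for p, o in zip(predicted, observed))
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    var_o = sum((o - mean_o) ** 2 for o in observed)
    if var_p == 0 or var_o == 0:
        raise ValueError("correlation is undefined for constant input")
    r = cov / sqrt(var_p * var_o)

    return mean(abs_errors), median(abs_errors), r


# Made-up ASA values (not data from the paper), in squared angstroms.
observed = [10.0, 50.0, 120.0, 0.0, 80.0]
predicted = [20.0, 40.0, 100.0, 5.0, 95.0]

mae, medae, r = asa_errors(predicted, observed)
print(f"mean abs. error = {mae:.2f}, median abs. error = {medae:.2f}, r = {r:.2f}")
```

Applying the same function to the residues of a single amino acid type would give per-residue-type figures comparable in kind to those quoted for cysteine and arginine.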