Prediction of protein relative solvent accessibility with support vector machines and long‐range interaction 3D local descriptor
- 12 December 2003
- journal article
- research article
- Published by Wiley in Proteins-Structure Function and Bioinformatics
- Vol. 54 (3), 557-562
- https://doi.org/10.1002/prot.10602
Abstract
Predicting the relative solvent accessibility of a protein provides useful information for predicting its tertiary structure. The SVMpsi method, which combines support vector machines (SVMs) with the position‐specific scoring matrix (PSSM) generated by PSI‐BLAST, has been applied to improve the prediction accuracy of relative solvent accessibility. We introduce a three‐dimensional local descriptor that encodes the expected remote contacts using both a long‐range interaction matrix and neighboring sequences. Moreover, we apply feature weights to the SVM kernels to account for the degree of significance, which depends on the distance from the specific amino acid. For the two‐state model, relative solvent accessibility is predicted with 78.7%, 80.7%, 82.4%, and 87.4% accuracy at accessibility thresholds of 25%, 16%, 5%, and 0%, respectively. Three‐state prediction achieves 64.5% accuracy with thresholds of 9% and 36%. The support vector machine approach has thus been applied successfully to solvent accessibility prediction by considering long‐range interactions and handling unbalanced data. Proteins 2004;54:000–000.
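The two‐state and three‐state models mentioned in the abstract assign each residue a class by comparing its relative solvent accessibility with one or two thresholds. The following is a minimal sketch of that labeling step only, not the authors' code. The function names, the class letters (`b`, `i`, `e`), the handling of values exactly at a threshold, and the sample values are all assumptions made for illustration.

```python
# Illustrative sketch: label residues from relative solvent accessibility (RSA, in percent).
# Assumption: a residue counts as exposed only when its RSA is strictly above the threshold,
# so a 0% threshold marks fully buried residues (RSA == 0) as buried.

def two_state(rsa_percent, threshold):
    """Return 'e' (exposed) or 'b' (buried) for one residue."""
    return "e" if rsa_percent > threshold else "b"


def three_state(rsa_percent, low=9.0, high=36.0):
    """Return 'b' (buried), 'i' (intermediate), or 'e' (exposed).

    The 9% and 36% defaults are the thresholds quoted in the abstract;
    the boundary handling (<=) is an assumption.
    """
    if rsa_percent <= low:
        return "b"
    if rsa_percent <= high:
        return "i"
    return "e"


if __name__ == "__main__":
    # Hypothetical RSA values for five residues (not taken from the paper).
    rsa_values = [0.0, 4.2, 12.5, 30.0, 61.3]

    # Two-state labels at each threshold quoted in the abstract.
    for threshold in (25, 16, 5, 0):
        labels = "".join(two_state(r, threshold) for r in rsa_values)
        print(f"two-state, {threshold}% threshold: {labels}")

    # Three-state labels with the 9% and 36% thresholds.
    print("three-state, 9%/36%:", "".join(three_state(r) for r in rsa_values))
```

Lower thresholds leave fewer residues in the buried class, which makes the classes more unbalanced; this may be the kind of unbalanced data the abstract refers to.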
Funding Information
- National Science Foundation (CCR-0204109, ACI-0305543)