Prediction of the bonding states of cysteines using the support vector machines based on multiple feature vectors and cysteine state sequences
- 16 April 2004
- journal article
- research article
- Published by Wiley in Proteins-Structure Function and Bioinformatics
- Vol. 55 (4) , 1036-1042
- https://doi.org/10.1002/prot.20079
Abstract
The support vector machine (SVM) method is used to predict the bonding states of cysteines. Besides using local descriptors such as the local sequences, we include global information, such as amino acid compositions and the patterns of the states of cysteines (bonded or nonbonded), or cysteine state sequences, of the proteins. We found that SVM based on local sequences or global amino acid compositions yielded similar prediction accuracies for the data set comprising 4136 cysteine‐containing segments extracted from 969 nonhomologous proteins. However, the SVM method based on multiple feature vectors (combining local sequences and global amino acid compositions) significantly improves the prediction accuracy, from 80% to 86%. If coupled with cysteine state sequences, SVM based on multiple feature vectors yields 90% in overall prediction accuracy and a 0.77 Matthews correlation coefficient, around 10% and 22% higher than the corresponding values obtained by SVM based on local sequence information. Proteins 2004;55:000–000.
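The idea of combining local and global descriptors into one feature vector, and of scoring predictions with the Matthews correlation coefficient, can be illustrated with a minimal sketch. It is not the authors' implementation: the one-hot encoding, the window half-width of 3, the example sequence, and all function names are assumptions made for illustration, since the abstract does not specify them. The SVM training step itself is omitted; the resulting vectors could be passed to any SVM library.

```python
import math

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues
AA_INDEX = {aa: i for i, aa in enumerate(AMINO_ACIDS)}


def local_window_features(seq, pos, half_width=3):
    """One-hot encode the residues in a window centred on seq[pos].

    Window slots that fall outside the sequence stay all zeros.
    The half-width is an assumed value, not taken from the paper.
    """
    features = []
    for i in range(pos - half_width, pos + half_width + 1):
        one_hot = [0.0] * len(AMINO_ACIDS)
        if 0 <= i < len(seq) and seq[i] in AA_INDEX:
            one_hot[AA_INDEX[seq[i]]] = 1.0
        features.extend(one_hot)
    return features


def composition_features(seq):
    """Global descriptor: fraction of each standard amino acid in the protein."""
    total = sum(1 for aa in seq if aa in AA_INDEX)
    if total == 0:
        return [0.0] * len(AMINO_ACIDS)
    return [seq.count(aa) / total for aa in AMINO_ACIDS]


def combined_features(seq, pos, half_width=3):
    """Concatenate the local window and the global composition vectors."""
    return local_window_features(seq, pos, half_width) + composition_features(seq)


def matthews_corrcoef(y_true, y_pred):
    """Matthews correlation coefficient for binary labels (1 = bonded, 0 = free)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention the coefficient is reported as 0 when the denominator vanishes.
    return (tp * tn - fp * fn) / denom if denom else 0.0


# Hypothetical sequence; one feature vector is built per cysteine.
sequence = "MKCAACGHC"
cysteine_positions = [i for i, aa in enumerate(sequence) if aa == "C"]
vectors = [combined_features(sequence, p) for p in cysteine_positions]
```

With a half-width of 3, each vector has 7 × 20 window entries plus 20 composition entries. Concatenation is only one way to combine descriptors; the abstract does not say how the multiple feature vectors are combined in the SVM.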