Prediction of protein structural classes using support vector machines
- 20 April 2006
- journal article
- research article
- Published by Springer Nature in Amino Acids
- Vol. 30 (4) , 469-475
- https://doi.org/10.1007/s00726-005-0239-0
Abstract
Summary. The support vector machine, a machine-learning method, is used to predict the four structural classes, i.e. mainly α, mainly β, α–β and few secondary structures (fss), at the topology level of the CATH protein structure database. For binary classification, any two structural classes which do not share any secondary structure, such as α and β elements, could be classified with accuracy as high as 90%. The accuracy, however, decreases to less than 70% if the structural classes to be classified contain structure elements in common. Our study also shows that feature spaces of dimension 20² = 400 (for dipeptides) and 20³ = 8000 (for tripeptides) give nearly the same prediction accuracy. Among these 4 structural classes, multi-class classification gives an overall accuracy of about 52%, indicating that the multi-class classification technique in support vector machines may still need to be further improved in future investigation.
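The feature-space dimensions quoted in the abstract follow from counting all ordered k-residue combinations of the 20 standard amino acids. The sketch below illustrates one plausible way to build such a vector. It assumes normalized k-peptide frequencies over overlapping windows, which the abstract does not specify; the function name `kmer_composition` and the example sequence are hypothetical and are not taken from the paper.

```python
from itertools import product

# The 20 standard amino acids in one-letter code.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def kmer_composition(sequence, k=2):
    """Return normalized k-peptide frequencies as a list of length 20**k.

    Assumption: overlapping windows, frequencies normalized to sum to 1.
    Windows containing non-standard residues are skipped.
    """
    kmers = ["".join(p) for p in product(AMINO_ACIDS, repeat=k)]
    index = {kmer: i for i, kmer in enumerate(kmers)}
    counts = [0] * len(kmers)
    total = 0
    for i in range(len(sequence) - k + 1):
        j = index.get(sequence[i:i + k])
        if j is not None:
            counts[j] += 1
            total += 1
    if total == 0:
        return [0.0] * len(kmers)
    return [c / total for c in counts]


# Hypothetical example sequence, for illustration only.
example = "MKVLAAGIVGLLA"
print(len(kmer_composition(example, k=2)))  # 400
print(len(kmer_composition(example, k=3)))  # 8000
```

A vector of this kind could then be passed to any SVM implementation as the input representation of a protein domain. Whether the authors normalized or scaled their features in this way is not stated in the abstract.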