GPCRpred: an SVM-based method for prediction of families and subfamilies of G-protein coupled receptors
Open Access
- 1 July 2004
- journal article
- research article
- Published by Oxford University Press (OUP) in Nucleic Acids Research
- Vol. 32 (Web Server), W383-W389
- https://doi.org/10.1093/nar/gkh416
Abstract
G-protein coupled receptors (GPCRs) belong to one of the largest superfamilies of membrane proteins and are important targets for drug design. In this study, a support vector machine (SVM)-based method, GPCRpred, has been developed for predicting families and subfamilies of GPCRs from the dipeptide composition of proteins. The dataset used in this study for training and testing was obtained from http://www.soe.ucsc.edu/research/compbio/gpcr/. The method classified GPCRs and non-GPCRs with an accuracy of 99.5% when evaluated using 5-fold cross-validation. The method is further able to predict five major classes or families of GPCRs with an overall Matthews correlation coefficient (MCC) of 0.81 and an accuracy of 97.5%. In recognizing the subfamilies of the rhodopsin-like family, the method achieved an average MCC of 0.97 and an accuracy of 97.3%. When evaluated on an independent (blind) dataset of 650 GPCRs, the method achieved overall accuracies of 91.3% at the family level and 96.4% at the subfamily level. A server for recognition and classification of GPCRs based on multiclass SVMs has been set up at http://www.imtech.res.in/raghava/gpcrpred/. We have also suggested subfamilies for 42 sequences which were previously identified as unclassified Class A GPCRs. The supplementary information is available at http://www.imtech.res.in/raghava/gpcrpred/info.html.
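The two quantities the abstract relies on, the dipeptide composition used as the SVM input and the MCC used for evaluation, can be illustrated with a minimal sketch. This is not the GPCRpred code: the function names `dipeptide_composition` and `matthews_cc` are hypothetical, the 400-dimensional vector assumes the 20 standard amino acids, and skipping dipeptides that contain any other symbol is an assumption made here for illustration.

```python
from itertools import product
from math import sqrt

# Assumption: the 20 standard amino acids, giving 20 * 20 = 400 dipeptides.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
DIPEPTIDES = ["".join(pair) for pair in product(AMINO_ACIDS, repeat=2)]
INDEX = {dp: i for i, dp in enumerate(DIPEPTIDES)}


def dipeptide_composition(sequence):
    """Return a 400-element list: the fraction of each dipeptide in the sequence."""
    seq = sequence.upper()
    counts = [0] * len(DIPEPTIDES)
    total = 0
    # Slide a window of width 2 over the sequence (overlapping dipeptides).
    for i in range(len(seq) - 1):
        idx = INDEX.get(seq[i:i + 2])
        if idx is not None:  # assumption: ignore pairs with non-standard residues
            counts[idx] += 1
            total += 1
    if total == 0:
        return [0.0] * len(DIPEPTIDES)
    return [c / total for c in counts]


def matthews_cc(tp, tn, fp, fn):
    """Matthews correlation coefficient from binary confusion-matrix counts."""
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Convention used here: return 0.0 when the denominator vanishes.
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom


if __name__ == "__main__":
    features = dipeptide_composition("ACAC")
    print(len(features), features[INDEX["AC"]], features[INDEX["CA"]])
    print(matthews_cc(tp=50, tn=50, fp=0, fn=0))
```

A fixed-length vector of this kind is what allows sequences of different lengths to be passed to an SVM; for a multiclass problem such as family prediction, the MCC would typically be computed per class from one-versus-rest counts.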
This publication has 12 references indexed in Scilit:
- Classification of Nuclear Receptors Based on Amino Acid Composition and Dipeptide Composition. Journal of Biological Chemistry, 2004
- Proteome-wide classification and identification of mammalian-type GPCRs by binary topology pattern. Computational Biology and Chemistry, 2004
- Automated generation and refinement of protein signatures: case study with G-protein coupled receptors. Bioinformatics, 2003
- A study on the correlation of G-protein-coupled receptor types with amino acid composition. Protein Engineering, Design and Selection, 2002
- Classifying G-protein coupled receptors with support vector machines. Bioinformatics, 2002
- Deriving structural and functional insights from a ligand-based hierarchical classification of G protein-coupled receptors. Protein Engineering, Design and Selection, 2002
- Support vector machine approach for protein subcellular localization prediction. Bioinformatics, 2001
- Collecting and harvesting biological data: the GPCRDB and NucleaRDB information systems. Nucleic Acids Research, 2001
- A Discriminative Framework for Detecting Remote Protein Homologies. Journal of Computational Biology, 2000
- The DEF data base of sequence based protein fold class predictions. 1994