Virus‐PLoc: A fusion classifier for predicting the subcellular localization of viral proteins within host and virus‐infected cells
- 21 November 2006
- journal article
- research article
- Published by Wiley in Biopolymers
- Vol. 85 (3) , 233-240
- https://doi.org/10.1002/bip.20640
Abstract
Viruses can reproduce their progenies only within a host cell, and their actions depend both on their destructive tendencies toward a specific host cell and on environmental conditions. Therefore, knowledge of the subcellular localization of viral proteins in a host cell or virus‐infected cell is very useful for in‐depth study of their functions and mechanisms, as well as for designing antiviral drugs. An analysis of the Swiss‐Prot database (version 50.0, released on May 30, 2006) indicates that only 23.5% of viral protein entries are annotated for their subcellular locations in this regard. As for the gene ontology database, the corresponding percentage is 23.8%. Such a gap calls for the development of high‐throughput tools for timely annotation of the localization of viral proteins within host and virus‐infected cells. In this article, a predictor called “Virus‐PLoc” has been developed that fuses many basic classifiers, each engineered according to the K‐nearest neighbor rule. The overall jackknife success rate obtained by Virus‐PLoc in identifying the subcellular compartments of viral proteins was 80% for a benchmark dataset in which none of the proteins has more than 25% sequence identity to any other in the same location site. Virus‐PLoc will be freely available as a web‐server at http://202.120.37.186/bioinf/virus for public use. Furthermore, Virus‐PLoc has been used to provide large‐scale predictions for all viral protein entries in the Swiss‐Prot database that do not have subcellular location annotations or are annotated as being uncertain. The results thus obtained have been deposited in a downloadable file prepared with Microsoft Excel and named “Tab_Virus‐PLoc.xls.” This file is available at the same website and will be updated twice a year to include new entries of viral proteins and to reflect the continuous development of Virus‐PLoc. © 2006 Wiley Periodicals, Inc. Biopolymers 85: 233–240, 2007.
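The abstract's two technical ideas, fusing several K‐nearest neighbor classifiers and scoring the result with a jackknife (leave‐one‐out) test, can be illustrated with a minimal Python sketch. It is not the Virus‐PLoc implementation: the toy 2‐D feature vectors, the location labels, the Euclidean distance, the choice of K values, and the plain majority vote used for fusion are all assumptions made for illustration, and the function names are hypothetical.

```python
from collections import Counter
import math


def knn_predict(train, query, k):
    """Label `query` by majority vote among its k nearest training samples."""
    # `train` is a list of (feature_vector, label) pairs.
    # Euclidean distance is an assumption made for this sketch.
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def fused_predict(train, query, ks=(1, 3, 5)):
    """Fuse one KNN classifier per value of k by a simple majority vote."""
    votes = Counter(knn_predict(train, query, k) for k in ks)
    return votes.most_common(1)[0][0]


def jackknife_success_rate(samples, ks=(1, 3, 5)):
    """Leave-one-out test: predict each sample from all remaining samples."""
    correct = 0
    for i, (features, label) in enumerate(samples):
        rest = samples[:i] + samples[i + 1:]
        if fused_predict(rest, features, ks) == label:
            correct += 1
    return correct / len(samples)


# Hypothetical toy data: made-up feature vectors and location labels.
TOY_SAMPLES = [
    ((0.0, 0.1), "nucleus"),
    ((0.1, 0.0), "nucleus"),
    ((0.2, 0.1), "nucleus"),
    ((0.1, 0.2), "nucleus"),
    ((1.0, 0.9), "cytoplasm"),
    ((0.9, 1.0), "cytoplasm"),
    ((1.1, 1.0), "cytoplasm"),
    ((1.0, 1.1), "cytoplasm"),
]

if __name__ == "__main__":
    print(jackknife_success_rate(TOY_SAMPLES))
```

In the jackknife test each sample is held out in turn and predicted from the rest, so every protein is used for both training and testing without ever being scored against itself.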
This article was originally published online as an accepted preprint. The “Published Online” date corresponds to the preprint version. You can request a copy of the preprint by emailing the Biopolymers editorial office at biopolymers@wiley.com.