PSORT-B: improving protein subcellular localization prediction for Gram-negative bacteria
Open Access
- 1 July 2003
- journal article
- research article
- Published by Oxford University Press (OUP) in Nucleic Acids Research
- Vol. 31 (13), 3613-3617
- https://doi.org/10.1093/nar/gkg602
Abstract
Automated prediction of bacterial protein subcellular localization is an important tool for genome annotation and drug discovery. PSORT has been one of the most widely used computational methods for such bacterial protein analysis; however, it has not been updated since it was introduced in 1991. In addition, neither PSORT nor any of the other computational methods available make predictions for all five of the localization sites characteristic of Gram-negative bacteria. Here we present PSORT-B, an updated version of PSORT for Gram-negative bacteria, which is available as a web-based application at http://www.psort.org. PSORT-B examines a given protein sequence for amino acid composition, similarity to proteins of known localization, presence of a signal peptide, transmembrane alpha-helices and motifs corresponding to specific localizations. A probabilistic method integrates these analyses, returning a list of five possible localization sites with associated probability scores. PSORT-B, designed to favor high precision (specificity) over high recall (sensitivity), attained an overall precision of 97% and recall of 75% in 5-fold cross-validation tests, using a dataset we developed of 1443 proteins of experimentally known localization. This dataset, the largest of its kind, is freely available, along with the PSORT-B source code (under GNU General Public License).
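The trade-off between precision and recall mentioned in the abstract can be illustrated with a minimal Python sketch. It is not PSORT-B's actual method: the function names `call_site` and `precision_recall`, the score cutoff, and the simplified definitions (precision as correct calls over calls made, recall as correct calls over all proteins) are assumptions made for illustration only. The idea shown is that a predictor which withholds a call when its top score is low tends to gain precision at the cost of recall.

```python
from typing import Dict, List, Optional, Tuple


def call_site(scores: Dict[str, float], threshold: float) -> Optional[str]:
    """Return the top-scoring site, or None (no call) if its score is below threshold.

    Hypothetical helper: `scores` maps a localization site name to a
    probability-like score; the threshold value is an assumption.
    """
    best = max(scores, key=lambda site: scores[site])
    return best if scores[best] >= threshold else None


def precision_recall(
    predictions: List[Optional[str]], truths: List[str]
) -> Tuple[float, float]:
    """Simplified overall precision and recall when the predictor may abstain.

    precision = correct calls / calls made
    recall    = correct calls / all proteins
    """
    made = [(p, t) for p, t in zip(predictions, truths) if p is not None]
    correct = sum(1 for p, t in made if p == t)
    precision = correct / len(made) if made else 0.0
    recall = correct / len(truths) if truths else 0.0
    return precision, recall


if __name__ == "__main__":
    # Toy scores and labels, invented for this example (not data from the paper).
    toy_scores = [
        {"cytoplasm": 0.90, "periplasm": 0.10},
        {"periplasm": 0.60, "outer membrane": 0.40},
        {"outer membrane": 0.50, "extracellular": 0.45},
        {"extracellular": 0.80, "periplasm": 0.20},
    ]
    toy_truths = ["cytoplasm", "periplasm", "extracellular", "extracellular"]

    for threshold in (0.0, 0.75):
        calls = [call_site(s, threshold) for s in toy_scores]
        p, r = precision_recall(calls, toy_truths)
        print(f"threshold={threshold:.2f} precision={p:.2f} recall={r:.2f}")
```

On this toy input, raising the cutoff removes the one wrong call and one correct call, so precision rises while recall falls. Real evaluations such as the 5-fold cross-validation described above would compute these figures on held-out folds rather than on a fixed toy list.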