Secreted protein prediction system combining CJ-SPHMM, TMHMM, and PSORT
- 1 December 2003
- journal article
- research article
- Published by Springer Nature in Mammalian Genome
- Vol. 14 (12) , 859-865
- https://doi.org/10.1007/s00335-003-2296-6
Abstract
To increase the coverage of secreted protein prediction, we describe a combination strategy. Instead of relying on a single method, we combine two Hidden Markov Model (HMM)-based methods, CJ-SPHMM and TMHMM, with PSORT. CJ-SPHMM is an HMM-based signal peptide prediction method, while TMHMM is an HMM-based transmembrane (TM) protein prediction algorithm. With CJ-SPHMM and TMHMM, proteins with a predicted signal peptide and without predicted TM regions are taken as putative secreted proteins. This HMM-based approach predicts secreted proteins with an accuracy (Ac) of 0.82 and a correlation coefficient (Cc) of 0.75, similar to PSORT with an Ac of 0.82 and a Cc of 0.76. When we further complement the HMM-based method (CJ-SPHMM + TMHMM) with PSORT, the Ac increases to 0.86 and the Cc to 0.81. Applying this combination strategy to search for putative secreted proteins in the International Protein Index (IPI) maintained at the European Bioinformatics Institute (EBI), we constructed a putative human secretome of 5235 proteins. The prediction system described here can also be applied to predicting secreted proteins from other vertebrate proteomes.
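The decision rule in the abstract can be sketched in a few lines of Python. This is an illustrative sketch only, not the authors' implementation: the `Prediction` record and its field names are hypothetical stand-ins for the parsed outputs of CJ-SPHMM, TMHMM, and PSORT. The abstract does not state how the PSORT call is merged with the HMM-based call, so the union (logical OR) used here is an assumption motivated by the stated goal of increasing coverage. Likewise, Cc is assumed to be the Matthews correlation coefficient, which the abstract does not confirm.

```python
from dataclasses import dataclass
from math import sqrt


@dataclass
class Prediction:
    """Hypothetical per-protein summary of the three predictors' outputs."""
    has_signal_peptide: bool  # assumed CJ-SPHMM call
    tm_region_count: int      # assumed number of TM regions predicted by TMHMM
    psort_secreted: bool      # assumed PSORT call


def hmm_secreted(p: Prediction) -> bool:
    # Rule stated in the abstract: a predicted signal peptide
    # and no predicted TM regions.
    return p.has_signal_peptide and p.tm_region_count == 0


def combined_secreted(p: Prediction) -> bool:
    # Assumption: PSORT complements the HMM-based call via a union.
    return hmm_secreted(p) or p.psort_secreted


def correlation_coefficient(tp: int, tn: int, fp: int, fn: int) -> float:
    # Assumption: Cc is the Matthews correlation coefficient.
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom
```

Under these assumptions, a protein with a signal peptide and two predicted TM regions is rejected by `hmm_secreted` but can still be accepted by `combined_secreted` if PSORT labels it as secreted.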