Transmembrane Topology and Signal Peptide Prediction Using Dynamic Bayesian Networks
Open Access
- 7 November 2008
- journal article
- research article
- Published by Public Library of Science (PLoS) in PLoS Computational Biology
- Vol. 4 (11) , e1000213
- https://doi.org/10.1371/journal.pcbi.1000213
Abstract
Hidden Markov models (HMMs) have been successfully applied to the tasks of transmembrane protein topology prediction and signal peptide prediction. In this paper we expand upon this work by making use of the more powerful class of dynamic Bayesian networks (DBNs). Our model, Philius, is inspired by a previously published HMM, Phobius, and combines a signal peptide submodel with a transmembrane submodel. We introduce a two-stage DBN decoder that combines the power of posterior decoding with the grammar constraints of Viterbi-style decoding. Philius also provides protein type, segment, and topology confidence metrics to aid in the interpretation of the predictions. We report a relative improvement of 13% over Phobius in full-topology prediction accuracy on transmembrane proteins, and a sensitivity and specificity of 0.96 in detecting signal peptides. We also show that our confidence metrics correlate well with the observed precision. In addition, we have made predictions on all 6.3 million proteins in the Yeast Resource Center (YRC) database. This large-scale study provides an overall picture of the relative numbers of proteins that include a signal-peptide and/or one or more transmembrane segments as well as a valuable resource for the scientific community. All DBNs are implemented using the Graphical Models Toolkit. Source code for the models described here is available at http://noble.gs.washington.edu/proj/philius. A Philius Web server is available at http://www.yeastrc.org/philius, and the predictions on the YRC database are available at http://www.yeastrc.org/pdr.
Author Summary
Transmembrane proteins control the flow of information and substances into and out of the cell and are involved in a broad range of biological processes. Their interfacing role makes them rewarding drug targets, and it is estimated that more than 50% of recently launched drugs target membrane proteins. However, experimentally determining the three-dimensional structure of a transmembrane protein is still a difficult task, and few of the currently known tertiary structures are of transmembrane proteins despite the fact that as many as one quarter of the proteins in a given organism are transmembrane proteins. Computational methods for predicting the basic topology of a transmembrane protein are therefore of great interest, and these methods must be able to distinguish between mature, membrane-spanning proteins and proteins that, when first synthesized, contain an N-terminal membrane-spanning signal peptide. In this work, we present Philius, a new computational approach that outperforms previous methods in simultaneously detecting signal peptides and correctly predicting the topology of transmembrane proteins. Philius also supplies a set of confidence scores with each prediction. A Philius Web server is available to the public as well as precomputed predictions for over six million proteins in the Yeast Resource Center database.
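The abstract describes a two-stage decoder that combines posterior decoding with the grammar constraints of Viterbi-style decoding, but gives no further detail. The sketch below illustrates one generic way such a combination can work on a plain HMM: stage 1 computes per-position state posteriors with forward-backward, and stage 2 runs a Viterbi-style dynamic program over those posteriors while permitting only transitions the model allows. This is not the Philius DBN or its decoder. The three-state topology, the two-letter alphabet (`H` for hydrophobic, `P` for polar), all probabilities, and the function names `posteriors` and `constrained_path` are assumptions invented for illustration.

```python
import math

# Toy labels (assumed): inside loop, membrane segment, outside loop.
STATES = ["i", "M", "o"]

# Toy parameters, invented for illustration only (not from Philius or Phobius).
START = {"i": 0.5, "M": 0.0, "o": 0.5}
TRANS = {
    "i": {"i": 0.8, "M": 0.2, "o": 0.0},  # grammar: i cannot jump straight to o
    "M": {"i": 0.1, "M": 0.8, "o": 0.1},
    "o": {"i": 0.0, "M": 0.2, "o": 0.8},  # grammar: o cannot jump straight to i
}
EMIT = {
    "i": {"H": 0.3, "P": 0.7},
    "M": {"H": 0.9, "P": 0.1},
    "o": {"H": 0.3, "P": 0.7},
}


def posteriors(obs):
    """Stage 1: scaled forward-backward; returns P(state at t | obs) per position."""
    n = len(obs)
    fwd, scale = [], []
    for t, x in enumerate(obs):
        row = {}
        for s in STATES:
            if t == 0:
                prior = START[s]
            else:
                prior = sum(fwd[t - 1][r] * TRANS[r][s] for r in STATES)
            row[s] = prior * EMIT[s][x]
        z = sum(row.values())
        scale.append(z)
        fwd.append({s: v / z for s, v in row.items()})
    bwd = [None] * n
    bwd[n - 1] = {s: 1.0 for s in STATES}
    for t in range(n - 2, -1, -1):
        nxt = obs[t + 1]
        bwd[t] = {
            s: sum(TRANS[s][r] * EMIT[r][nxt] * bwd[t + 1][r] for r in STATES)
            / scale[t + 1]
            for s in STATES
        }
    post = []
    for t in range(n):
        row = {s: fwd[t][s] * bwd[t][s] for s in STATES}
        z = sum(row.values())
        post.append({s: v / z for s, v in row.items()})
    return post


def constrained_path(post):
    """Stage 2: Viterbi-style DP over log posteriors using only legal transitions."""

    def log(p):
        return math.log(p) if p > 0 else -math.inf

    score = [{s: log(post[0][s]) for s in STATES}]
    back = []
    for t in range(1, len(post)):
        row, ptr = {}, {}
        for s in STATES:
            # Only predecessors with a nonzero transition into s are considered.
            legal = [r for r in STATES if TRANS[r][s] > 0]
            best = max(legal, key=lambda r: score[t - 1][r])
            row[s] = score[t - 1][best] + log(post[t][s])
            ptr[s] = best
        score.append(row)
        back.append(ptr)
    state = max(STATES, key=lambda s: score[-1][s])
    path = [state]
    for ptr in reversed(back):
        state = ptr[state]
        path.append(state)
    return "".join(reversed(path))


if __name__ == "__main__":
    sequence = "PPPHHHHHHHHPPP"  # toy input over the assumed H/P alphabet
    print(constrained_path(posteriors(sequence)))
```

Taking the most probable state at each position independently can yield a labeling that violates the topology grammar (for example, an inside loop directly followed by an outside loop). Restricting the second pass to legal transitions avoids that while still scoring paths by the posteriors rather than by a single best joint path.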