Incorporating hidden Markov models for identifying protein kinase‐specific phosphorylation sites
- 11 May 2005
- journal article
- research article
- Published by Wiley in Journal of Computational Chemistry
- Vol. 26 (10) , 1032-1041
- https://doi.org/10.1002/jcc.20235
Abstract
Protein phosphorylation, which is an important mechanism in posttranslational modification, affects essential cellular processes such as metabolism, cell signaling, differentiation, and membrane transportation. Proteins are phosphorylated by a variety of protein kinases. In this investigation, we develop a novel tool to computationally predict catalytic kinase‐specific phosphorylation sites. The known phosphorylation sites from public domain data sources are categorized by their annotated protein kinases. Based on the concepts of profile Hidden Markov Models (HMMs), computational models are trained from the kinase‐specific groups of phosphorylation sites. After evaluating the trained models, we select the model with the highest accuracy in each kinase‐specific group and provide a Web‐based prediction tool for identifying protein phosphorylation sites. The main contribution here is that we have developed a kinase‐specific phosphorylation site prediction tool with both high sensitivity and specificity. © 2005 Wiley Periodicals, Inc. J Comput Chem 26: 1032–1041, 2005
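The selection step described in the abstract (evaluate several trained models per kinase group, then keep the most accurate one) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the kinase labels, the model names `model_a` and `model_b`, and all confusion-matrix counts are hypothetical placeholders, and the abstract does not state how accuracy was computed, so the standard definition is assumed here.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class Counts:
    """Confusion-matrix counts for one model on one evaluation set."""
    tp: int  # true phosphorylation sites predicted as sites
    fp: int  # non-sites predicted as sites
    tn: int  # non-sites predicted as non-sites
    fn: int  # true sites predicted as non-sites


def sensitivity(c: Counts) -> float:
    """Fraction of true sites that were recovered: TP / (TP + FN)."""
    denom = c.tp + c.fn
    return c.tp / denom if denom else 0.0


def specificity(c: Counts) -> float:
    """Fraction of non-sites correctly rejected: TN / (TN + FP)."""
    denom = c.tn + c.fp
    return c.tn / denom if denom else 0.0


def accuracy(c: Counts) -> float:
    """Fraction of all predictions that were correct."""
    total = c.tp + c.fp + c.tn + c.fn
    return (c.tp + c.tn) / total if total else 0.0


def select_best(evals: Dict[str, Dict[str, Counts]]) -> Dict[str, str]:
    """For each kinase group, return the name of the most accurate model."""
    return {
        kinase: max(candidates, key=lambda name: accuracy(candidates[name]))
        for kinase, candidates in evals.items()
    }


# Hypothetical evaluation results (made-up numbers, for illustration only).
evaluations = {
    "PKA": {
        "model_a": Counts(tp=80, fp=10, tn=90, fn=20),
        "model_b": Counts(tp=90, fp=30, tn=70, fn=10),
    },
    "CK2": {
        "model_a": Counts(tp=40, fp=5, tn=45, fn=10),
        "model_b": Counts(tp=45, fp=5, tn=45, fn=5),
    },
}

best = select_best(evaluations)
for kinase, name in best.items():
    c = evaluations[kinase][name]
    print(kinase, name, round(sensitivity(c), 2), round(specificity(c), 2))
```

Reporting sensitivity and specificity alongside accuracy matters because a model can score well on accuracy alone when non-sites greatly outnumber true sites.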