PepHMM: A Hidden Markov Model Based Scoring Function for Mass Spectrometry Database Search
- 13 December 2005
- journal article
- research article
- Published by American Chemical Society (ACS) in Analytical Chemistry
- Vol. 78 (2), 432-437
- https://doi.org/10.1021/ac051319a
Abstract
An accurate scoring function for database search is crucial for peptide identification using tandem mass spectrometry. Although many mathematical models have been proposed to score peptides against tandem mass spectra, our method (called PepHMM, http://msms.cmb.usc.edu) is unique in that it combines information on machine accuracy, mass peak intensity, and correlation among ions into a hidden Markov model (HMM). In addition, we develop a method to calculate the statistical significance of the HMM scores. We implement the method and test it on two sets of experimental data generated by two different types of mass spectrometers, comparing the results with MASCOT and SEQUEST under the same conditions. On one data set, PepHMM achieves a much higher accuracy (6.5% error rate) than MASCOT (17.4% error rate); on the other, PepHMM identifies 43% and 31% more correct spectra than SEQUEST and MASCOT, respectively.