MASPIC: Intensity-Based Tandem Mass Spectrometry Scoring Scheme That Improves Peptide Identification at High Confidence
- 22 October 2005
- journal article
- research article
- Published by American Chemical Society (ACS) in Analytical Chemistry
- Vol. 77 (23) , 7581-7593
- https://doi.org/10.1021/ac0501745
Abstract
Algorithmic search engines bridge the gap between large tandem mass spectrometry data sets and the identification of proteins associated with biological samples. Improvements in these tools can greatly enhance biological discovery. We present a new scoring scheme for comparing tandem mass spectra with a protein sequence database. The MASPIC (Multinomial Algorithm for Spectral Profile-based Intensity Comparison) scorer converts an experimental tandem mass spectrum into an m/z profile of probability and then scores peak lists from potential candidate peptides using a multinomial distribution model. The MASPIC scoring scheme incorporates intensity, spectral peak density variations, and the m/z error distribution associated with peak matches into a multinomial distribution. The scoring scheme was validated on two standard protein mixtures and an additional set of spectra collected on a complex ribosomal protein mixture from Rhodopseudomonas palustris. The results indicate a 5–15% improvement over Sequest for high-confidence identifications. The performance gap grows as sequence database size increases. Additional tests on spectra from proteinase-K digest data showed similar performance improvements, demonstrating the advantages of using MASPIC for studying proteins digested with less specific proteases. All these investigations show MASPIC to be a versatile and reliable system for peptide tandem mass spectral identification.
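The general idea of scoring peak matches with a multinomial model can be illustrated with a minimal sketch. This is not the published MASPIC algorithm: the intensity cutoffs, the fixed match tolerance, the uniform random-match assumption, and the names `intensity_class` and `multinomial_score` are all hypothetical choices made for illustration, and the sketch ignores the peak density variations and m/z error distribution that the abstract mentions.

```python
import math

def intensity_class(intensity, base, cuts=(0.5, 0.1)):
    """Class 0 holds the most intense peaks (assumed cutoffs, relative to base peak)."""
    rel = intensity / base
    for c, cut in enumerate(cuts):
        if rel >= cut:
            return c
    return len(cuts)

def multinomial_score(peaks, theoretical_mz, mz_min, mz_max, tol=0.5, cuts=(0.5, 0.1)):
    """Return -log10 of the multinomial probability of the observed match counts.

    peaks: list of (m/z, intensity) from the experimental spectrum.
    theoretical_mz: predicted fragment m/z values of one candidate peptide.
    Assumes a random m/z value falls uniformly in [mz_min, mz_max].
    """
    base = max(i for _, i in peaks)
    n_classes = len(cuts) + 1
    classes = [(mz, intensity_class(i, base, cuts)) for mz, i in peaks]

    # Chance that a random m/z lands within tol of a peak of each class
    # (window overlaps are ignored in this approximation).
    span = mz_max - mz_min
    probs = [0.0] * n_classes
    for _, c in classes:
        probs[c] += 2 * tol / span
    p_miss = 1.0 - sum(probs)
    if p_miss <= 0.0:
        raise ValueError("spectrum too dense for this simple model")
    probs.append(p_miss)

    # Count theoretical peaks by the best (most intense) class they match.
    counts = [0] * (n_classes + 1)
    for t in theoretical_mz:
        hits = [c for mz, c in classes if abs(mz - t) <= tol]
        counts[min(hits) if hits else n_classes] += 1

    # log of n! / prod(k_i!) * prod(p_i ** k_i)
    log_p = math.lgamma(len(theoretical_mz) + 1)
    for k, p in zip(counts, probs):
        if k:
            log_p += k * math.log(p) - math.lgamma(k + 1)
    return -log_p / math.log(10)

# Toy spectrum and three hypothetical candidates.
peaks = [(100.0, 50), (200.0, 900), (300.0, 400),
         (400.0, 30), (500.0, 700), (600.0, 10)]
strong = multinomial_score(peaks, [200.1, 500.0, 300.2], 50.0, 650.0)
weak = multinomial_score(peaks, [100.0, 400.0, 600.0], 50.0, 650.0)
none = multinomial_score(peaks, [150.0, 250.0, 350.0], 50.0, 650.0)
print(strong > weak > none)
```

In this toy setup, a candidate whose predicted fragments coincide with intense peaks receives a higher score than one matching only weak peaks, because intense peaks are rarer and therefore less likely to be matched by chance. A point probability is used here for simplicity; a tail probability over all equally or more extreme outcomes would be a stricter choice.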