Empirical Statistical Model To Estimate the Accuracy of Peptide Identifications Made by MS/MS and Database Search
- 12 September 2002
- journal article
- research article
- Published by American Chemical Society (ACS) in Analytical Chemistry
- Vol. 74 (20), 5383-5392
- https://doi.org/10.1021/ac025747h
Abstract
We present a statistical model to estimate the accuracy of peptide assignments to tandem mass (MS/MS) spectra made by database search applications such as SEQUEST. Employing the expectation maximization algorithm, the analysis learns to distinguish correct from incorrect database search results, computing probabilities that peptide assignments to spectra are correct based upon database search scores and the number of tryptic termini of peptides. Using SEQUEST search results for spectra generated from a sample of known protein components, we demonstrate that the computed probabilities are accurate and have high power to discriminate between correctly and incorrectly assigned peptides. This analysis makes it possible to filter large volumes of MS/MS database search results with predictable false identification error rates and can serve as a common standard by which the results of different research groups are compared.
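The idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a single combined search score per spectrum, models both the correct and incorrect populations as Gaussians (a simplification chosen here for brevity), and leaves out the number of tryptic termini. The names `fit_two_component_em` and `expected_false_rate` are hypothetical, and the data are synthetic.

```python
import math
import random


def normal_pdf(x, mean, sd):
    """Density of a normal distribution at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def fit_two_component_em(scores, n_iter=200):
    """Fit a two-Gaussian mixture to search scores with EM.

    Assumption of this sketch: the higher-mean component stands for
    correct assignments and the lower-mean one for incorrect assignments.
    Returns (probabilities, params), where probabilities[i] is the
    posterior probability that scores[i] is a correct assignment.
    """
    n = len(scores)
    ordered = sorted(scores)
    # Crude initialisation: quartiles as means, overall spread as both sds.
    mu_neg, mu_pos = ordered[n // 4], ordered[(3 * n) // 4]
    mean_all = sum(scores) / n
    sd_all = math.sqrt(sum((s - mean_all) ** 2 for s in scores) / n)
    sd_neg = sd_pos = max(sd_all, 1e-6)
    prior_pos = 0.5
    post = [0.5] * n

    for _ in range(n_iter):
        # E-step: posterior probability of "correct" for every score.
        post = []
        for s in scores:
            a = prior_pos * normal_pdf(s, mu_pos, sd_pos)
            b = (1.0 - prior_pos) * normal_pdf(s, mu_neg, sd_neg)
            post.append(a / (a + b) if a + b > 0.0 else 0.5)

        # M-step: re-estimate mixture weight, means and standard deviations.
        w_pos = max(sum(post), 1e-12)
        w_neg = max(n - sum(post), 1e-12)
        prior_pos = min(max(w_pos / n, 1e-6), 1.0 - 1e-6)
        mu_pos = sum(p * s for p, s in zip(post, scores)) / w_pos
        mu_neg = sum((1.0 - p) * s for p, s in zip(post, scores)) / w_neg
        var_pos = sum(p * (s - mu_pos) ** 2 for p, s in zip(post, scores)) / w_pos
        var_neg = sum((1.0 - p) * (s - mu_neg) ** 2 for p, s in zip(post, scores)) / w_neg
        sd_pos = max(math.sqrt(var_pos), 1e-6)
        sd_neg = max(math.sqrt(var_neg), 1e-6)

    params = {"prior_pos": prior_pos, "mu_pos": mu_pos, "sd_pos": sd_pos,
              "mu_neg": mu_neg, "sd_neg": sd_neg}
    return post, params


def expected_false_rate(probabilities, threshold):
    """Predicted fraction of incorrect assignments among those accepted.

    An assignment is accepted when its probability is >= threshold; the
    expected number of errors among accepted items is the sum of (1 - p).
    """
    accepted = [p for p in probabilities if p >= threshold]
    if not accepted:
        return 0.0
    return sum(1.0 - p for p in accepted) / len(accepted)


if __name__ == "__main__":
    random.seed(0)
    # Synthetic scores only: 700 "incorrect" and 300 "correct" assignments.
    scores = [random.gauss(0.0, 1.0) for _ in range(700)]
    scores += [random.gauss(5.0, 1.0) for _ in range(300)]
    probs, params = fit_two_component_em(scores)
    print("estimated fraction correct:", round(params["prior_pos"], 3))
    print("predicted error rate at p >= 0.9:",
          round(expected_false_rate(probs, 0.9), 4))
```

Because the error rate is computed from the probabilities themselves, a probability threshold can be chosen to meet a target false identification rate without knowing which individual assignments are wrong. That prediction is only as good as the mixture model's fit to the real score distributions.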