Determining the Overall Merit of Protein Identification Data Sets: rho-Diagrams and rho-Scores
- 31 March 2007
- journal article
- research article
- Published by American Chemical Society (ACS) in Journal of Proteome Research
- Vol. 6 (5), 1997-2004
- https://doi.org/10.1021/pr070025y
Abstract
This paper describes a simple heuristic method for determining the merit of a set of peptide sequence assignments made using tandem mass spectra. The method involves comparing a prediction based on the known stochastic behavior of a sequence assignment algorithm with the assignments generated from a particular data set. A particular formulation of this comparison is defined through the construction of a plot of the data, the rho-diagram, as well as a parameter derived from this plot, the rho-score. This plot and parameter are shown to readily characterize the relative quality of a set of peptide sequence assignments and to allow the straightforward determination of probability threshold values for the interpretation of proteomics data. The plot is independent of the algorithm or scoring scheme used to estimate the statistical significance of a set of experimental results; rather, it can be used as an objective test of the correctness of those estimates. The rho-score can also be used as a parameter to evaluate the relative merit of protein identifications, such as those made across proteome species taxonomic categories.
Keywords: rho-score • rho-diagram • bioinformatics • protein identification • peptide identification
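The comparison the abstract describes, a stochastic prediction set against the assignments actually observed, can be illustrated with a minimal Python sketch. This is not the paper's rho-diagram or rho-score, whose exact definitions the abstract does not give. The sketch assumes each spectrum's best assignment carries a calibrated expectation value, so that purely random assignments would yield roughly `n * t` results at or below a threshold `t`. The function name `stochastic_comparison` and the example values are hypothetical.

```python
def stochastic_comparison(e_values, thresholds):
    """Compare observed assignment counts with a chance-only prediction.

    ASSUMPTION (not from the abstract): expectation values are calibrated,
    so n purely random assignments give about n * t values <= t.
    Returns a list of (threshold, observed, expected_random) tuples.
    """
    n = len(e_values)
    rows = []
    for t in thresholds:
        # Assignments in this data set that pass the threshold.
        observed = sum(1 for e in e_values if e <= t)
        # Chance-only prediction, capped at the number of spectra.
        expected_random = min(n * t, n)
        rows.append((t, observed, expected_random))
    return rows


# Hypothetical expectation values for four spectra.
example = stochastic_comparison([1e-6, 1e-4, 0.5, 0.9], [0.001, 1.0])
for threshold, observed, expected_random in example:
    print(threshold, observed, expected_random)
```

Under these assumptions, a data set in which the observed count far exceeds the chance-only count at small thresholds contains many non-random assignments, while one where the two counts track each other offers little evidence beyond chance.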