RAId_DbS: Peptide Identification using Database Searches with Realistic Statistics
Open Access
- 25 October 2007
- journal article
- Published by Springer Nature in Biology Direct
- Vol. 2 (1) , 25
- https://doi.org/10.1186/1745-6150-2-25
Abstract
The key to mass-spectrometry-based proteomics is peptide identification. A major challenge in peptide identification is to obtain realistic E-values when assigning statistical significance to candidate peptides. Using a simple scoring scheme, we propose a database search method with theoretically characterized statistics. Taking into account possible skewness in the random variable distribution and the effect of finite sampling, we provide a theoretical derivation for the tail of the score distribution. For every experimental spectrum examined, we collect the scores of peptides in the database, and find good agreement between the collected score statistics and our theoretical distribution. Using Student's t-tests, we quantify the degree of agreement between the theoretical distribution and the score statistics collected. The t-tests may be used to measure the reliability of reported statistics. When combined with the reported P-value for a peptide hit using a score distribution model, this new measure prevents exaggerated statistics. Another feature of RAId_DbS is its capability of detecting multiple co-eluted peptides. The peptide identification performance and statistical accuracy of RAId_DbS are assessed and compared with several other search tools. The executables and data related to RAId_DbS are freely available upon request.
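To make the relationship between a P-value and an E-value concrete, here is a minimal sketch of the general idea. It is not the RAId_DbS method: the abstract describes a theoretically derived tail for the score distribution, while this sketch substitutes a plain empirical tail estimate from the collected scores. The function names, the pseudo-count of one, and the toy scores are all assumptions made for illustration.

```python
from bisect import bisect_left


def empirical_p_value(background_scores, observed_score):
    """Estimate P(score >= observed_score) from collected background scores.

    Assumption: a pseudo-count of one is added to numerator and denominator
    so the estimate is never exactly zero. This is an empirical stand-in,
    not the theoretical tail derived in the paper.
    """
    ordered = sorted(background_scores)
    # Number of background scores greater than or equal to the observed one.
    n_at_least = len(ordered) - bisect_left(ordered, observed_score)
    return (n_at_least + 1) / (len(ordered) + 1)


def e_value(p_value, n_candidates):
    """Expected number of random candidates scoring at least as well.

    Uses the standard relation E = N * P, where N is the number of
    candidate peptides scored against the spectrum.
    """
    return p_value * n_candidates


# Toy scores for one spectrum (hypothetical values, not real data).
scores = [1, 2, 3, 4, 5, 6, 7, 8, 9]
p = empirical_p_value(scores, 8)
e = e_value(p, len(scores))
print(f"P-value: {p:.3f}, E-value: {e:.3f}")
```

An empirical estimate like this cannot resolve P-values smaller than about one over the number of collected scores, which is one reason a characterized tail of the score distribution is useful for highly significant hits.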