Separating the Wheat from the Chaff: Unbiased Filtering of Background Tandem Mass Spectra Improves Protein Identification
- 18 June 2008
- journal article
- research article
- Published by American Chemical Society (ACS) in Journal of Proteome Research
- Vol. 7 (8) , 3382-3395
- https://doi.org/10.1021/pr800140v
Abstract
Only a small fraction of spectra acquired in LC-MS/MS runs matches peptides from target proteins upon database searches. The remaining, operationally termed background, spectra originate from a variety of poorly controlled sources and affect the throughput and confidence of database searches. Here, we report an algorithm and its software implementation that rapidly removes background spectra, regardless of their precise origin. The method estimates the dissimilarity distance between screened MS/MS spectra and unannotated spectra from a partially redundant background library compiled from several control and blank runs. Filtering MS/MS queries enhanced the protein identification capacity when searches lacked spectrum to sequence matching specificity. In sequence-similarity searches it reduced by, on average, 30-fold the number of orphan hits, which were not explicitly related to background protein contaminants and required manual validation. Removing high quality background MS/MS spectra, while preserving in the data set the genuine spectra from target proteins, decreased the false positive rate of stringent database searches and improved the identification of low-abundance proteins.
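The filtering idea in the abstract (discard a query spectrum when it lies close to a spectrum in a background library) can be illustrated with a minimal sketch. The abstract does not state which dissimilarity measure, tolerances, or thresholds the published software uses, so everything below is an assumption made for illustration: cosine distance on unit-width m/z bins, a precursor m/z tolerance of 2.0, a distance cutoff of 0.2, and the hypothetical function names `bin_spectrum`, `cosine_distance`, and `filter_background`. It is not the authors' implementation.

```python
import math
from collections import defaultdict


def bin_spectrum(peaks, bin_width=1.0):
    """Sum peak intensities into fixed-width m/z bins and scale to unit length.

    peaks is a list of (mz, intensity) pairs; bin_width is an assumed value.
    """
    binned = defaultdict(float)
    for mz, intensity in peaks:
        binned[int(mz // bin_width)] += intensity
    norm = math.sqrt(sum(v * v for v in binned.values()))
    if norm == 0:
        return {}
    return {k: v / norm for k, v in binned.items()}


def cosine_distance(a, b):
    """Return 1 - cosine similarity of two unit-length binned spectra."""
    dot = sum(v * b.get(k, 0.0) for k, v in a.items())
    return 1.0 - dot


def filter_background(queries, library, precursor_tol=2.0, max_distance=0.2):
    """Keep only the queries that have no close match in the background library.

    queries and library are lists of (precursor_mz, peaks) pairs.
    precursor_tol and max_distance are illustrative, not published, values.
    """
    binned_library = [(pmz, bin_spectrum(peaks)) for pmz, peaks in library]
    kept = []
    for pmz, peaks in queries:
        query_vec = bin_spectrum(peaks)
        is_background = any(
            abs(pmz - lib_pmz) <= precursor_tol
            and cosine_distance(query_vec, lib_vec) <= max_distance
            for lib_pmz, lib_vec in binned_library
        )
        if not is_background:
            kept.append((pmz, peaks))
    return kept
```

In this sketch the library would be built from spectra acquired in control and blank runs, and only the spectra returned by `filter_background` would be submitted to the database search. Comparing spectra without annotating them is what lets such a filter act on background of unknown origin.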