Peptide Sequence Tags for Fast Database Search in Mass-Spectrometry
- 15 June 2005
- journal article
- research article
- Published by American Chemical Society (ACS) in Journal of Proteome Research
- Vol. 4 (4), 1287-1295
- https://doi.org/10.1021/pr050011x
Abstract
Filtration techniques in the form of rapid elimination of candidate sequences while retaining the true one are key ingredients of database searches in genomics. Although SEQUEST and Mascot perform a conceptually similar task to the tool BLAST, the key algorithmic idea of BLAST (filtration) was never implemented in these tools. As a result MS/MS protein identification tools are becoming too time-consuming for many applications including search for post-translationally modified peptides. Moreover, matching millions of spectra against all known proteins will soon make these tools too slow in the same way that “genome vs genome” comparisons instantly made BLAST too slow. We describe the development of filters for MS/MS database searches that dramatically reduce the running time and effectively remove the bottlenecks in searching the huge space of protein modifications. Our approach, based on a probability model for determining the accuracy of sequence tags, achieves superior results compared to GutenTag, a popular tag generation algorithm. Our tag generating algorithm along with our de novo sequencing algorithm PepNovo can be accessed via the URL http://peptide.ucsd.edu/.
Keywords: tags • tandem mass spectrometry • filtering • database search • PepNovo
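The filtration idea in the abstract can be illustrated with a minimal sketch. This is not the paper's algorithm: it assumes that short sequence tags have already been derived from a spectrum, and it treats a tag as a plain substring, ignoring the flanking masses and the probability model the abstract mentions. The function name `filter_by_tags`, the example peptides, and the tags are all hypothetical.

```python
from typing import Iterable, List


def filter_by_tags(peptides: Iterable[str], tags: Iterable[str]) -> List[str]:
    """Keep only the candidate peptides that contain at least one tag.

    Simplifying assumption: a tag is a short amino-acid string and a
    match is an exact substring hit. Real tag filters also check the
    masses on either side of the tag.
    """
    tag_list = [t for t in tags if t]  # drop empty tags, which would match everything
    return [p for p in peptides if any(t in p for t in tag_list)]


# Hypothetical peptide database and tags, for illustration only.
database = ["LVNELTEFAK", "YLYEIAR", "AEFVEVTK", "HLVDEPQNLIK"]
tags = ["ELT", "VDE"]

# Only the surviving candidates would be passed to the slower, full scoring step.
candidates = filter_by_tags(database, tags)
print(candidates)  # ['LVNELTEFAK', 'HLVDEPQNLIK']
```

The point of the sketch is the ordering of work: a cheap substring test discards most of the database, so an expensive spectrum-to-peptide scoring function only runs on the few candidates that remain.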