De Novo Peptide Identification via Tandem Mass Spectrometry and Integer Linear Optimization
- 12 January 2007
- journal article
- research article
- Published by American Chemical Society (ACS) in Analytical Chemistry
- Vol. 79 (4) , 1433-1446
- https://doi.org/10.1021/ac0618425
Abstract
A novel methodology for the automated de novo identification of peptides via integer linear optimization (also referred to as integer linear programming or ILP) and tandem mass spectrometry is presented in this article. The various features of the mathematical model are presented and examples are used to illustrate the key concepts of the proposed approach. A variety of challenging peptide identification problems, accompanied by a comparative study with five state-of-the-art methods, are examined to illustrate the proposed method's ability to address (a) residue-dependent fragmentation properties that result in missing ion peaks and (b) the variability of resolution in different mass analyzers. A preprocessing algorithm is utilized to identify important m/z values in the tandem mass spectrum. Missing peaks, due to residue-dependent fragmentation characteristics, are dealt with using a two-stage algorithmic framework. A cross-correlation approach is used to resolve missing amino acid assignments and to select the most probable peptide by comparing the theoretical spectra of the candidate sequences that were generated from the ILP sequencing stages with the experimental tandem mass spectrum. The novel, proposed de novo method, denoted as PILOT, is compared to existing popular methods such as Lutefisk, PEAKS, PepNovo, EigenMS, and NovoHMM for a set of spectra resulting from QTOF and ion trap instruments.
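The final step described in the abstract, ranking candidate sequences by comparing their theoretical spectra with the experimental spectrum, can be illustrated with a small sketch. This is not PILOT's formulation: the abstract does not give the scoring function, so the code below swaps in a much simpler matched-intensity score as a stand-in for cross-correlation. The function names (`theoretical_ions`, `match_score`, `best_candidate`), the 0.5 Da tolerance, and the restriction to singly charged b- and y-ions are all assumptions made for illustration.

```python
# Illustrative sketch only: a matched-intensity score standing in for the
# cross-correlation step. Names and the tolerance are hypothetical.

# Monoisotopic residue masses in Da.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
PROTON = 1.007276   # mass of a proton, Da
WATER = 18.010565   # monoisotopic mass of H2O, Da


def theoretical_ions(peptide):
    """Return sorted m/z values of singly charged b- and y-ions."""
    masses = [RESIDUE_MASS[r] for r in peptide]
    ions = []
    for i in range(1, len(masses)):
        ions.append(sum(masses[:i]) + PROTON)          # b-ion: N-terminal prefix
        ions.append(sum(masses[i:]) + WATER + PROTON)  # y-ion: C-terminal suffix
    return sorted(ions)


def match_score(peptide, peaks, tol=0.5):
    """Sum the intensity of experimental peaks within tol of a theoretical ion."""
    ions = theoretical_ions(peptide)
    return sum(
        intensity
        for mz, intensity in peaks
        if any(abs(mz - t) <= tol for t in ions)
    )


def best_candidate(candidates, peaks, tol=0.5):
    """Pick the candidate sequence whose theoretical ions explain the most intensity."""
    return max(candidates, key=lambda p: match_score(p, peaks, tol))


if __name__ == "__main__":
    # Synthetic "experimental" spectrum built from a known peptide.
    peaks = [(mz, 1.0) for mz in theoretical_ions("PEPTIDE")]
    print(best_candidate(["PEPTIED", "PEPTIDE", "PETPIDE"], peaks))
```

Because leucine and isoleucine share the same residue mass, a score of this kind cannot distinguish them. A real spectrum would also contain noise peaks, neutral losses, and multiply charged fragments, none of which this sketch models.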
This publication has 49 references indexed in Scilit:
- NovoHMM: A Hidden Markov Model for de Novo Peptide Sequencing. Analytical Chemistry, 2005
- Analysis, statistical validation and dissemination of large-scale proteomics datasets generated by tandem MS. Drug Discovery Today, 2004
- Improved peptide sequencing using isotope information inherent in tandem mass spectra. Rapid Communications in Mass Spectrometry, 2003
- A Suboptimal Algorithm for De Novo Peptide Sequencing via Tandem Mass Spectrometry. Journal of Computational Biology, 2003
- New computational approaches for de novo peptide sequencing from MS/MS experiments. Proceedings of the IEEE, 2002
- Empirical Statistical Model To Estimate the Accuracy of Peptide Identifications Made by MS/MS and Database Search. Analytical Chemistry, 2002
- Automated interpretation of low-energy collision-induced dissociation spectra by SeqMS, a software aid for de novo sequencing by tandem mass spectrometry. Electrophoresis, 2000
- Probability-based protein identification by searching sequence databases using mass spectrometry data. Electrophoresis, 1999
- An approach to correlate tandem mass spectral data of peptides with amino acid sequences in a protein database. Journal of the American Society for Mass Spectrometry, 1994
- Tandem mass spectrometry of peptides using hybrid and four-sector instruments: A comparative study. Analytical Chemistry, 1991