Implementation and Uses of Automated de Novo Peptide Sequencing by Tandem Mass Spectrometry
- 3 May 2001
- journal article
- research article
- Published by American Chemical Society (ACS) in Analytical Chemistry
- Vol. 73 (11) , 2594-2604
- https://doi.org/10.1021/ac001196o
Abstract
There are several computer programs that can match peptide tandem mass spectrometry data to their exactly corresponding database sequences, and in most protein identification projects, these programs are utilized in the early stages of data interpretation. However, situations frequently arise where tandem mass spectral data cannot be correlated with any database sequences. In these cases, the unmatched data could be due to peptides derived from novel proteins, allelic or species-derived variants of known proteins, or posttranslational or chemical modifications. Two additional problems are frequently encountered in high-throughput protein identification. First, it is difficult to quickly sift through large amounts of data to identify those spectra that, due to poor signal or contaminants, can be ignored. Second, it is important to find incorrect database matches (false positives). We have chosen to address these difficulties by performing automatic de novo sequencing using a computer program called Lutefisk. Sequence candidates obtained are used as input in a homology-based database search program called CIDentify to identify variants of known proteins. Comparison of database-derived sequences with de novo sequences allows for electronic validation of database matches even if the latter are not completely correct. Modifications to the original Lutefisk program have been implemented to handle data obtained from triple quadrupole, ion trap, and quadrupole/time-of-flight hybrid (Qtof) mass spectrometers. For example, the linearity of mass errors due to temperature-dependent expansion of the flight tube in a Qtof was exploited such that isobaric amino acids (glutamine/lysine and oxidized methionine/phenylalanine) can be differentiated without careful attention to mass calibration.
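The abstract does not say how Lutefisk exploits the linear mass error, so the following is only a minimal sketch of the general idea, not the program's actual procedure. It assumes the measurement error grows linearly with m/z, fits that line by least squares to a few fragment ions whose true masses are taken as known, and then uses the corrected mass difference between adjacent ions to choose between glutamine and lysine. The calibrant masses, the 150 ppm drift, the 0.01 Da offset, and every function name are hypothetical values chosen for illustration; the residue masses are standard monoisotopic values.

```python
# Monoisotopic residue masses (Da) for the two isobaric pairs named in the abstract.
RESIDUE_MASS = {
    "Q": 128.05858,      # glutamine
    "K": 128.09496,      # lysine
    "F": 147.06841,      # phenylalanine
    "M(ox)": 147.03540,  # oxidized methionine
}


def fit_linear_error(observed, reference):
    """Least-squares fit of the error (observed - reference) against observed m/z."""
    n = len(observed)
    errors = [o - r for o, r in zip(observed, reference)]
    mean_x = sum(observed) / n
    mean_e = sum(errors) / n
    sxx = sum((x - mean_x) ** 2 for x in observed)
    sxe = sum((x - mean_x) * (e - mean_e) for x, e in zip(observed, errors))
    slope = sxe / sxx
    intercept = mean_e - slope * mean_x
    return slope, intercept


def correct(mz, slope, intercept):
    """Subtract the fitted linear error from an observed m/z value."""
    return mz - (slope * mz + intercept)


def closest_residue(gap, candidates):
    """Return the candidate residue whose mass is nearest to the given mass gap."""
    return min(candidates, key=lambda r: abs(RESIDUE_MASS[r] - gap))


def simulate(true_mz, ppm_slope=150.0, offset=0.01):
    """Hypothetical instrument: error proportional to mass plus a constant offset."""
    return true_mz * (1 + ppm_slope * 1e-6) + offset


# Hypothetical calibrant ions whose true masses are assumed to be known.
reference = [200.0, 500.0, 900.0, 1300.0]
observed = [simulate(m) for m in reference]
slope, intercept = fit_linear_error(observed, reference)

# Two adjacent fragment ions that truly differ by one glutamine residue.
lower_true = 600.0
upper_true = lower_true + RESIDUE_MASS["Q"]
lower_obs, upper_obs = simulate(lower_true), simulate(upper_true)

raw_gap = upper_obs - lower_obs
fixed_gap = correct(upper_obs, slope, intercept) - correct(lower_obs, slope, intercept)

raw_call = closest_residue(raw_gap, ["Q", "K"])
fixed_call = closest_residue(fixed_gap, ["Q", "K"])
print(f"uncorrected gap {raw_gap:.5f} Da -> {raw_call}")
print(f"corrected gap   {fixed_gap:.5f} Da -> {fixed_call}")
```

In this simulated case the uncorrected gap falls closer to lysine, while the corrected gap recovers glutamine. The same comparison applies to the oxidized methionine/phenylalanine pair, whose residue masses differ by about 0.033 Da.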