Role of Accurate Mass Measurement (±10 ppm) in Protein Identification Strategies Employing MS or MS/MS and Database Searching

Abstract
We describe the impact of improved mass measurement accuracy, ±10 ppm with internal calibration, on protein identification experiments. This capability was brought about by delayed extraction used in conjunction with matrix-assisted laser desorption ionization (MALDI) on a reflectron time-of-flight (TOF) mass spectrometer. This work explores the advantage of accurate mass measurement, which constrains the possible elemental compositions of components in a protein digest, in strategies for searching protein, gene, and EST databases that employ (a) mass values alone, (b) fragment-ion tagging derived from MS/MS spectra, and (c) de novo interpretation of MS/MS spectra. The discriminating power of database searches improves significantly even when only the measured masses (molecular weight values) of >10 peptides are used. When MALDI-TOF instruments are able to achieve the ±0.5−5 ppm mass accuracy necessary to distinguish peptide elemental compositions, it is possible to match homologous proteins having >70% sequence identity to the protein being analyzed. A parent mass of a single tryptic peptide measured to ±10 ppm, combined with the near-complete amino acid (AA) composition information from immonium ions generated by MS/MS, can tag a peptide in a database, because for a given AA composition only a few sequence permutations >11 AAs in length can ever be found in a proteome. De novo interpretation of peptide MS/MS spectra may be accomplished by altering our MS-Tag program so that, instead of searching an entire database, it calculates only the sequence permutations possible from the accurate parent mass and the immonium-ion-limited AA compositions. A hybrid strategy is employed in which de novo MS/MS interpretation is followed by text-based sequence similarity searching of a database.
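
To make the ±10 ppm figure concrete, the following minimal sketch shows how a parts-per-million tolerance translates into a mass window and how measured peptide masses might be compared against theoretical digest masses. It is an illustration only, not the search programs used in this work: the function names (`ppm_error`, `mass_window`, `count_matches`) and all mass values are hypothetical.

```python
def ppm_error(measured, theoretical):
    """Signed mass error in parts per million."""
    return (measured - theoretical) / theoretical * 1e6


def mass_window(mass, tol_ppm):
    """Lower and upper mass bounds for a symmetric ppm tolerance."""
    delta = mass * tol_ppm * 1e-6
    return mass - delta, mass + delta


def count_matches(measured_masses, theoretical_masses, tol_ppm=10.0):
    """Count measured masses that fall within tol_ppm of any theoretical mass."""
    return sum(
        any(abs(ppm_error(m, t)) <= tol_ppm for t in theoretical_masses)
        for m in measured_masses
    )


# Hypothetical values, chosen only to illustrate the arithmetic.
measured = [1000.005, 1500.100]
theoretical = [1000.000, 1500.000]

low, high = mass_window(2000.0, 10.0)  # window is about +/-0.02 Da at mass 2000
n_tight = count_matches(measured, theoretical, tol_ppm=10.0)
n_loose = count_matches(measured, theoretical, tol_ppm=100.0)
```

At mass 2000 a ±10 ppm tolerance corresponds to a window of roughly ±0.02 Da, so a tighter tolerance admits fewer chance matches per measured peptide; in this toy data the second mass is about 67 ppm off and matches only under the looser tolerance.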