Database searching using mass spectrometry data
- 1 May 1998
- journal article
- review article
- Published by Wiley in Electrophoresis
- Vol. 19 (6) , 893-900
- https://doi.org/10.1002/elps.1150190604
Abstract
Large‐scale DNA sequencing is creating a sequence infrastructure of great benefit to protein biochemistry. Concurrent with the application of large‐scale DNA sequencing to whole genome analysis, mass spectrometry has attained the capability to rapidly, and with remarkable sensitivity, determine weights and amino acid sequences of peptides. Computer algorithms have been developed to use the two different types of data generated by mass spectrometers to search sequence databases. When a protein is digested with a site‐specific protease, the molecular weights of the resulting collection of peptides, the mass map or fingerprint, can be determined using mass spectrometry. The molecular weights of the set of peptides derived from the digestion of a protein can then be used to identify the protein. Several different approaches have been developed. Protein identification using peptide mass mapping is an effective technique when studying organisms with completed genomes. A second method is based on the use of data created by tandem mass spectrometers. Tandem mass spectra contain highly specific information in the fragmentation pattern as well as sequence information. This information has been used to search databases of translated protein sequences as well as nucleotide databases such as expressed sequence tag (EST) sequences. The ability to search nucleotide databases is an advantage when analyzing data obtained from organisms whose genomes are not yet completed, but a large amount of expressed gene sequence is available (e.g., human and mouse). Furthermore, a strength of using tandem mass spectra to search databases is the ability to identify proteins present in fairly complex mixtures.
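The peptide mass mapping idea described in the abstract can be illustrated with a minimal Python sketch. It is not the algorithm of any particular search program covered by the review. It assumes trypsin as the site-specific protease (cleavage after K or R, except before P), no missed cleavages or modifications, neutral monoisotopic masses, and a simple count of matching masses as the score. The function names, the toy sequences, and the 0.01 Da tolerance are all hypothetical choices made for illustration.

```python
# Monoisotopic residue masses (Da) of the 20 standard amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056  # mass of H2O added to the residue sum of a free peptide


def tryptic_digest(sequence):
    """Split a protein sequence after K or R, unless the next residue is P."""
    peptides, start = [], 0
    for i, residue in enumerate(sequence):
        next_residue = sequence[i + 1] if i + 1 < len(sequence) else ""
        if residue in "KR" and next_residue != "P":
            peptides.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        peptides.append(sequence[start:])  # C-terminal peptide
    return peptides


def peptide_mass(peptide):
    """Neutral monoisotopic mass of an unmodified peptide."""
    return sum(RESIDUE_MASS[aa] for aa in peptide) + WATER


def count_matches(observed_masses, sequence, tolerance=0.01):
    """Count observed masses that match a theoretical peptide mass."""
    theoretical = [peptide_mass(p) for p in tryptic_digest(sequence)]
    return sum(
        any(abs(obs - theo) <= tolerance for theo in theoretical)
        for obs in observed_masses
    )


def rank_proteins(observed_masses, database, tolerance=0.01):
    """Rank database entries by number of matched masses, best first."""
    scores = [
        (name, count_matches(observed_masses, seq, tolerance))
        for name, seq in database.items()
    ]
    return sorted(scores, key=lambda item: item[1], reverse=True)


# Toy database with made-up sequences (assumption, for illustration only).
database = {
    "protA": "MAGKLLSRPEWK",
    "protB": "GGSTKVVDERFFYK",
}
# Simulate a measured mass map by digesting protA in silico.
observed = [peptide_mass(p) for p in tryptic_digest(database["protA"])]
ranking = rank_proteins(observed, database)
print(ranking)
```

Counting matches is the simplest possible score. A practical search would also need to handle missed cleavages, modifications, measurement error that scales with mass, and the fact that larger proteins produce more peptides and therefore more chance matches.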