Identification of Bacteria Using Tandem Mass Spectrometry Combined with a Proteome Database and Statistical Scoring
- 13 March 2004
- journal article
- research article
- Published by American Chemical Society (ACS) in Analytical Chemistry
- Vol. 76 (8), 2355-2366
- https://doi.org/10.1021/ac0349781
Abstract
Detection and identification of pathogenic bacteria and their protein toxins play a crucial role in a proper response to natural or terrorist-caused outbreaks of infectious diseases. The recent availability of whole genome sequences of priority bacterial pathogens opens new diagnostic possibilities for identification of bacteria by retrieving their genomic or proteomic information. We describe a method for identification of bacteria based on tandem mass spectrometric (MS/MS) analysis of peptides derived from bacterial proteins. This method involves bacterial cell protein extraction, trypsin digestion, liquid chromatography MS/MS analysis of the resulting peptides, and a statistical scoring algorithm to rank MS/MS spectral matching results for bacterial identification. To facilitate spectral data searching, a proteome database was constructed by translating genomes of bacteria of interest with fully or partially determined sequences. In this work, a prototype database was constructed by the automated analysis of 87 publicly available, fully sequenced bacterial genomes with the GLIMMER gene finding software. MS/MS peptide spectral matching for peptide sequence assignment against this proteome database was done by SEQUEST. To gauge the relative significance of the SEQUEST-generated matching parameters for correct peptide assignment, discriminant function (DF) analysis of these parameters was applied and DF scores were used to calculate probabilities of correct MS/MS spectra assignment to peptide sequences in the database. The peptides with DF scores exceeding a threshold value determined by the probability of correct peptide assignment were accepted and matched to the bacterial proteomes represented in the database. Sequence filtering or removal of degenerate peptides matched with multiple bacteria was then performed to further improve identification. 
It is demonstrated that using a preset criterion with known distributions of discriminant function scores and probabilities of correct peptide sequence assignments, a test bacterium within the 87 database microorganisms can be unambiguously identified.
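The acceptance and filtering steps described in the abstract (keep peptides whose DF score exceeds a threshold, discard degenerate peptides matched to more than one bacterium, then rank bacteria by their remaining peptides) can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `rank_bacteria`, the input tuple layout, the threshold value, and the ranking by unique-peptide count are all assumptions made for illustration, and the DF scores are taken as already computed from the SEQUEST output.

```python
from collections import Counter


def rank_bacteria(peptide_hits, df_threshold):
    """Rank bacteria by the number of accepted, non-degenerate peptides.

    peptide_hits: iterable of (peptide, df_score, organisms) tuples, where
    `organisms` lists every database bacterium whose proteome contains the
    peptide. This input layout is hypothetical.
    df_threshold: DF score cutoff (hypothetical value chosen by the caller).
    """
    counts = Counter()
    seen = set()
    for peptide, df_score, organisms in peptide_hits:
        # Accept only peptides whose DF score exceeds the threshold.
        if df_score <= df_threshold:
            continue
        # Sequence filtering: drop degenerate peptides shared by
        # more than one bacterium in the database.
        unique_organisms = set(organisms)
        if len(unique_organisms) != 1:
            continue
        # Count each distinct peptide sequence only once.
        if peptide in seen:
            continue
        seen.add(peptide)
        counts[next(iter(unique_organisms))] += 1
    # Highest count first; ranking by raw count is an assumption.
    return counts.most_common()


# Toy data, invented purely for illustration.
hits = [
    ("PEPA", 2.0, ["E. coli"]),
    ("PEPB", 1.5, ["E. coli", "B. subtilis"]),  # degenerate, removed
    ("PEPC", 0.2, ["B. subtilis"]),             # below threshold, removed
    ("PEPD", 1.8, ["E. coli"]),
    ("PEPE", 1.1, ["B. subtilis"]),
]
ranking = rank_bacteria(hits, df_threshold=1.0)
print(ranking)
```

In this toy run the top-ranked entry is the candidate identification. A real pipeline would derive the threshold from the probability of correct peptide assignment, as the abstract describes, rather than fixing it by hand.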