Mining Genomes: Correlating Tandem Mass Spectra of Modified and Unmodified Peptides to Sequences in Nucleotide Databases

Abstract
The correlation of uninterpreted tandem mass spectra of modified and unmodified peptides, produced under low-energy (10-50 eV) collision conditions, with nucleotide sequences is demonstrated. In this method, nucleotide databases are translated in six reading frames, and the resulting amino acid sequences are searched "on the fly" to identify and fit linear sequences to the fragmentation patterns observed in the tandem mass spectra of peptides. A cross-correlation function is then used to measure the similarity between the mass-to-charge ratios of the fragment ions predicted from amino acid sequences translated from the nucleotide database and those of the fragment ions observed in the tandem mass spectrum. In general, a difference greater than 0.1 between the normalized cross-correlation functions for the first- and second-ranked search results indicates a successful match between sequence and spectrum. The deviation from maximum similarity is measured using the spectral reconstruction method. The search method employing nucleotide databases is also demonstrated on the spectra of phosphorylated peptides. Specific sites of modification are identified even though the character-based sequence information of nucleotide databases contains no information about sites of modification.
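
As an illustration of the two steps that can be stated precisely from the abstract, the following is a minimal sketch of six-frame translation and of the 0.1 difference criterion applied to ranked scores. It is not the authors' implementation: the function names (six_frame_translations, is_confident_match), the use of the standard genetic code, the "X" placeholder for codons containing ambiguous bases, and the normalization of scores to the top-ranked score are all assumptions made for this example.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order (assumed here).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def reverse_complement(dna):
    """Return the reverse complement of an upper-case DNA string."""
    return dna.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(dna):
    """Translate complete codons from position 0; unknown codons become 'X'."""
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3)
    )


def six_frame_translations(dna):
    """Translate a nucleotide sequence in three forward and three reverse frames."""
    dna = dna.upper()
    frames = {}
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(seq[offset:])
    return frames


def is_confident_match(scores, threshold=0.1):
    """Apply the difference criterion to raw cross-correlation scores.

    Scores are normalized to the top-ranked score (an assumption), so the
    difference between first and second rank is 1 - second / first.
    """
    ranked = sorted(scores, reverse=True)
    if len(ranked) < 2 or ranked[0] <= 0:
        return False
    return (1.0 - ranked[1] / ranked[0]) > threshold


if __name__ == "__main__":
    for name, protein in six_frame_translations("ATGGCCAAGTAA").items():
        print(name, protein)
    print(is_confident_match([4.0, 3.0, 2.0]))
```

In a full search, each translated frame would be scanned for candidate peptides whose predicted fragment ions are then scored against the observed spectrum; that scoring step is not reproduced here because the abstract does not specify it in enough detail.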
