MultiTag: Multiple Error-Tolerant Sequence Tag Search for the Sequence-Similarity Identification of Proteins by Mass Spectrometry
- 20 February 2003
- journal article
- research article
- Published by American Chemical Society (ACS) in Analytical Chemistry
- Vol. 75 (6) , 1307-1315
- https://doi.org/10.1021/ac026199a
Abstract
The characterization of proteomes by mass spectrometry is largely limited to organisms with sequenced genomes. To identify proteins from organisms with unsequenced genomes, database sequences from related species must be employed for sequence-similarity protein identifications. Peptide sequence tags (Mann, 1994) have been used successfully for the identification of proteins in sequence databases using partially interpreted tandem mass spectra of tryptic peptides. We have extended sequence tag searching to the identification of proteins whose sequences are not yet known but are homologous to known database entries. The MultiTag method presented here assigns statistical significance to matches of multiple error-tolerant sequence tags to a database entry and ranks alignments by their significance. The MultiTag approach has a distinct advantage over other sequence-similarity approaches: it can perform sequence-similarity identifications using only very short stretches of peptide sequence (2−4 amino acid residues), rather than complete peptide sequences deduced by de novo interpretation of tandem mass spectra. This feature facilitates the identification of low-abundance proteins, since noisy and low-intensity tandem mass spectra can be utilized.
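The idea of matching several error-tolerant sequence tags to one database entry can be illustrated with a minimal sketch. It is not the MultiTag implementation: the function names, the toy database, and the mass tolerance are hypothetical, a tag is simplified to (prefix residue mass, short sequence, suffix residue mass) rather than fragment-ion m/z values, and proteins are ranked by the raw number of matched tags instead of the statistical significance the abstract describes.

```python
# Monoisotopic residue masses (Da) of the 20 standard amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}


def residue_sum(seq):
    """Sum of residue masses of a peptide stretch (no water, no charge)."""
    return sum(RESIDUE_MASS[aa] for aa in seq)


def tryptic_peptides(protein):
    """Cleave after every K or R (simplification: the proline rule is ignored)."""
    peptides, start = [], 0
    for i, aa in enumerate(protein):
        if aa in "KR":
            peptides.append(protein[start:i + 1])
            start = i + 1
    if start < len(protein):
        peptides.append(protein[start:])
    return peptides


def tag_matches(peptide, tag, tol=0.5, error_tolerant=True):
    """Check a (prefix_mass, tag_seq, suffix_mass) tag against one peptide.

    Strict mode requires the tag sequence and both flanking masses to agree.
    Error-tolerant mode (assumed here) accepts the tag sequence plus either
    one of the flanking masses, so a substitution in one flank is tolerated.
    """
    prefix_mass, tag_seq, suffix_mass = tag
    # I and L are isobaric, so treat them as the same letter.
    peptide = peptide.replace("I", "L")
    tag_seq = tag_seq.replace("I", "L")
    pos = peptide.find(tag_seq)
    while pos != -1:
        prefix_ok = abs(residue_sum(peptide[:pos]) - prefix_mass) <= tol
        suffix_ok = abs(residue_sum(peptide[pos + len(tag_seq):]) - suffix_mass) <= tol
        if prefix_ok and suffix_ok:
            return True
        if error_tolerant and (prefix_ok or suffix_ok):
            return True
        pos = peptide.find(tag_seq, pos + 1)
    return False


def rank_proteins(database, tags, tol=0.5):
    """Rank database entries by how many of the tags hit any tryptic peptide."""
    hits = {}
    for name, seq in database.items():
        peptides = tryptic_peptides(seq)
        n = sum(any(tag_matches(p, t, tol) for p in peptides) for t in tags)
        if n:
            hits[name] = n
    return sorted(hits.items(), key=lambda kv: (-kv[1], kv[0]))


# Toy database (hypothetical sequences).
database = {"protA": "MAGSKTVLEDRPEPTIDEK", "protB": "GGGGKAAAAR"}

# Two short tags from an imagined homolog of protA: the first peptide is
# identical (TVLEDR), the second carries a T->S substitution (PEPSIDEK).
tags = [
    (residue_sum("TV"), "LED", residue_sum("R")),
    (residue_sum("PEPS"), "IDE", residue_sum("K")),
]

print(rank_proteins(database, tags))
```

In this toy case the second tag fails a strict match against PEPTIDEK because the prefix mass is shifted by the substitution, but it still matches in error-tolerant mode through the tag sequence and the suffix mass, so protA is hit by both tags while protB is not hit at all.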
This publication has 18 references indexed in Scilit:
- Empirical Statistical Model To Estimate the Accuracy of Peptide Identifications Made by MS/MS and Database Search. Analytical Chemistry, 2002
- Analysis of Proteins and Proteomes by Mass Spectrometry. Annual Review of Biochemistry, 2001
- Charting the Proteomes of Organisms with Unsequenced Genomes by MALDI-Quadrupole Time-of-Flight Mass Spectrometry and BLAST Homology Searching. Analytical Chemistry, 2001
- Replication fork density increases during DNA synthesis in X. laevis egg extracts (edited by M. Yaniv). Journal of Molecular Biology, 2000
- Probability-based protein identification by searching sequence databases using mass spectrometry data. Electrophoresis, 1999
- Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Research, 1997
- Identification of components of trans‐Golgi network‐derived transport vesicles and detergent‐insoluble complexes by nanoelectrospray tandem mass spectrometry. Electrophoresis, 1997
- A shortcut to interesting human genes: peptide sequence tags, expressed-sequence tags and computers. Trends in Biochemical Sciences, 1996
- Error-Tolerant Identification of Peptides in Sequence Databases by Peptide Sequence Tags. Analytical Chemistry, 1994
- Searching protein sequence libraries: Comparison of the sensitivity and selectivity of the Smith-Waterman and FASTA algorithms. Genomics, 1991