GutenTag: High-Throughput Sequence Tagging via an Empirically Derived Fragmentation Model
- 18 October 2003
- journal article
- research article
- Published by American Chemical Society (ACS) in Analytical Chemistry
- Vol. 75 (23) , 6415-6421
- https://doi.org/10.1021/ac0347462
Abstract
Shotgun proteomics is a powerful tool for identifying the protein content of complex mixtures via liquid chromatography and tandem mass spectrometry. The most widely used class of algorithms for analyzing mass spectra of peptides has been database search software such as SEQUEST. A new sequence tag database search algorithm, called GutenTag, makes it possible to identify peptides with unknown posttranslational modifications or sequence variations. This software automates the process of inferring partial sequence “tags” directly from the spectrum and efficiently examines a sequence database for peptides that match these tags. When multiple candidate sequences result from the database search, the software evaluates which is the best match by a rapid examination of spectral fragment ions. We compare GutenTag's accuracy to that of SEQUEST on a defined protein mixture, showing that both modified and unmodified peptides can be successfully identified by this approach. GutenTag analyzed 33 000 spectra from a human lens sample, identifying peptides that were missed in prior SEQUEST analysis due to sequence polymorphisms and posttranslational modifications. The software is available under license; visit http://fields.scripps.edu for information.
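The general idea of sequence tagging described in the abstract can be illustrated with a minimal Python sketch. It is not GutenTag's implementation: it does not use the empirically derived fragmentation model, it ignores charge states, noise, and modifications, and it does not rank candidates. The function names `infer_tags` and `search_database`, the tag length, and the mass tolerance are all assumptions made for illustration. The sketch reads a tag from a ladder of peaks whose spacings match amino acid residue masses, then looks for that tag (in either direction) in a set of sequences.

```python
from typing import Dict, List, Set

# Monoisotopic residue masses in Da. Isoleucine is folded into leucine
# because the two are isobaric and cannot be told apart by mass alone.
RESIDUE_MASS: Dict[str, float] = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "N": 114.04293, "D": 115.02694, "Q": 128.05858, "K": 128.09496,
    "E": 129.04259, "M": 131.04049, "H": 137.05891, "F": 147.06841,
    "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}


def infer_tags(peaks: List[float], tag_len: int = 3, tol: float = 0.02) -> Set[str]:
    """Return every tag of tag_len residues spelled by a ladder of peaks.

    Hypothetical helper: assumes singly charged fragment ions of one series.
    """
    peaks = sorted(peaks)
    tags: Set[str] = set()

    def extend(last: int, tag: str) -> None:
        # A complete tag is recorded; otherwise try every heavier peak.
        if len(tag) == tag_len:
            tags.add(tag)
            return
        for nxt in range(last + 1, len(peaks)):
            gap = peaks[nxt] - peaks[last]
            for aa, mass in RESIDUE_MASS.items():
                if abs(gap - mass) <= tol:
                    extend(nxt, tag + aa)

    for start in range(len(peaks)):
        extend(start, "")
    return tags


def search_database(tags: Set[str], database: Dict[str, str]) -> Dict[str, List[str]]:
    """Map each sequence name to the tags it contains, forward or reversed.

    Reversed tags are checked because the ladder may come from either
    the N-terminal or the C-terminal ion series.
    """
    hits: Dict[str, List[str]] = {}
    for name, seq in database.items():
        seq = seq.replace("I", "L")  # match the I/L folding used above
        found = sorted(t for t in tags if t in seq or t[::-1] in seq)
        if found:
            hits[name] = found
    return hits


# Made-up example: a peak ladder spaced by P, E, T, L from an arbitrary offset.
example_peaks = [200.0, 297.05276, 426.09535, 527.14303, 640.22709]
example_db = {"protA": "MKPEPETIDEK", "protB": "GGGAAA"}
example_hits = search_database(infer_tags(example_peaks), example_db)
```

With these made-up inputs, the ladder yields the tags PET and ETL, and only `protA` contains them. Because matching relies on a short internal tag rather than the full peptide mass, a tag can still match when a modification or substitution lies elsewhere in the peptide, which is the property the abstract highlights.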