The use of proteotypic peptide libraries for protein identification
- 8 June 2005
- journal article
- research article
- Published by Wiley in Rapid Communications in Mass Spectrometry
- Vol. 19 (13) , 1844-1850
- https://doi.org/10.1002/rcm.1992
Abstract
This paper describes an algorithm to apply proteotypic peptide sequence libraries to protein identifications performed using tandem mass spectrometry (MS/MS). Proteotypic peptides are those peptides in a protein sequence that are most likely to be confidently observed by current MS-based proteomics methods. Libraries of proteotypic peptide sequences were compiled from the Global Proteome Machine Database for Homo sapiens and Saccharomyces cerevisiae model species proteomes. These libraries were used to scan through collections of tandem mass spectra to discover which proteins were represented by the data sets, followed by detailed analysis of the spectra with the full protein sequences corresponding to the discovered proteotypic peptides. This algorithm (Proteotypic Peptide Profiling, or P3) resulted in sequence-to-spectrum matches comparable to those obtained by conventional protein identification algorithms using only full protein sequences, with a 20-fold reduction in the time required to perform the identification calculations. The proteotypic peptide libraries, the open source code for the implementation of the search algorithm and a website for using the software have been made freely available. Approximately 4% of the residues in the H. sapiens proteome were required in the proteotypic peptide library to successfully identify proteins. Copyright © 2005 John Wiley & Sons, Ltd.
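The two-stage search outlined in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a toy matching rule (peptide mass against observed precursor masses within a fixed tolerance) in place of real MS/MS fragment-ion scoring, and the names `p3_search`, `tryptic_digest`, `library`, `proteins` and `observed` are hypothetical. Stage 1 scans the data with the small proteotypic library to find candidate proteins; stage 2 searches only the full sequences of those candidates.

```python
# Monoisotopic residue masses (Da) for the 20 standard amino acids.
MONO = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056


def peptide_mass(seq):
    """Neutral monoisotopic mass of an unmodified peptide."""
    return sum(MONO[aa] for aa in seq) + WATER


def tryptic_digest(protein):
    """Cleave after K or R unless the next residue is P (no missed cleavages)."""
    peptides, start = [], 0
    for i, aa in enumerate(protein):
        nxt = protein[i + 1] if i + 1 < len(protein) else ""
        if aa in "KR" and nxt != "P":
            peptides.append(protein[start:i + 1])
            start = i + 1
    if start < len(protein):
        peptides.append(protein[start:])
    return peptides


def matches(mass, observed, tol):
    """Toy stand-in for spectrum scoring: mass agreement within tol (Da)."""
    return any(abs(mass - m) <= tol for m in observed)


def p3_search(library, proteins, observed, tol=0.01):
    """library: proteotypic peptide -> protein id (assumed layout).
    proteins: protein id -> full sequence. observed: precursor masses."""
    # Stage 1: scan with the proteotypic library to find candidate proteins.
    candidates = {
        prot for pep, prot in library.items()
        if matches(peptide_mass(pep), observed, tol)
    }
    # Stage 2: detailed search restricted to the candidates' full sequences.
    results = {}
    for prot in sorted(candidates):
        results[prot] = [
            pep for pep in tryptic_digest(proteins[prot])
            if matches(peptide_mass(pep), observed, tol)
        ]
    return results
```

In this sketch the saving comes from stage 2 touching only the proteins flagged in stage 1, which mirrors the reduction in search space the abstract describes; the actual speed-up and scoring behaviour depend on details not given here.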