Protein Identification in Complex Mixtures
- 9 February 2005
- journal article
- Published by American Chemical Society (ACS) in Journal of Proteome Research
- Vol. 4 (2), 387-393
- https://doi.org/10.1021/pr049816f
Abstract
This paper investigates the prospects of successful mass spectrometric protein identification based on mass data from proteolytic digests of complex protein mixtures. Sets of proteolytic peptide masses representing various numbers of digested proteins in a mixture were generated in silico. In each set, different proteins were selected from a protein sequence collection, and for each protein the sequence coverage was randomly selected within a particular regime (15-30% or 30-60%). We demonstrate that the Probity algorithm, which is characterized by an optimal tolerance for random interference, can, when employed in an iterative procedure, correctly identify >95% of proteins at a desired significance level in mixtures composed of hundreds of yeast proteins under realistic mass spectrometric experimental constraints. By using a model of the distribution of protein abundance, we demonstrate that the very high efficiency of identification of protein mixtures that can be achieved by appropriate choices of informatics procedures is hampered by limitations of the mass spectrometric dynamic range. The results stress the need to choose experimental protocols for comprehensive proteome analysis carefully, focusing on truly critical issues such as the dynamic range, which potentially limits the possibilities of identifying low abundance proteins.
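The in silico generation of peptide mass sets described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' procedure and does not implement Probity. The function names, the tryptic cleavage rule (cleave after K or R unless followed by P), the use of monoisotopic neutral masses, and the way peptides are drawn until a randomly chosen coverage target is reached are all assumptions made for illustration.

```python
import random

# Monoisotopic residue masses in Da (standard values, rounded to 5 decimals).
MONO = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056  # mass of H2O added to the residue sum of a free peptide


def tryptic_peptides(seq):
    """Split a sequence after K or R unless the next residue is P (assumed rule)."""
    peptides, start = [], 0
    for i, aa in enumerate(seq):
        nxt = seq[i + 1] if i + 1 < len(seq) else ""
        if aa in "KR" and nxt != "P":
            peptides.append(seq[start:i + 1])
            start = i + 1
    if start < len(seq):
        peptides.append(seq[start:])
    return peptides


def peptide_mass(peptide):
    """Neutral monoisotopic mass of a peptide."""
    return sum(MONO[aa] for aa in peptide) + WATER


def sample_protein(seq, lo, hi, rng):
    """Draw peptides in random order until a coverage target in [lo, hi] is met.

    The achieved coverage can overshoot the target because whole peptides
    are added; this is a simplification of the sketch.
    """
    target = rng.uniform(lo, hi)
    peptides = tryptic_peptides(seq)
    rng.shuffle(peptides)
    chosen, covered = [], 0
    for pep in peptides:
        if covered / len(seq) >= target:
            break
        chosen.append(pep)
        covered += len(pep)
    return [peptide_mass(p) for p in chosen], covered / len(seq)


def simulate_mixture(proteins, n, lo, hi, seed=0):
    """Pool the sampled peptide masses of n distinct proteins into one sorted list."""
    rng = random.Random(seed)
    masses = []
    for seq in rng.sample(proteins, n):
        m, _ = sample_protein(seq, lo, hi, rng)
        masses.extend(m)
    return sorted(masses)


if __name__ == "__main__":
    # Toy sequences, purely hypothetical; a real run would read a sequence collection.
    toy = ["MKTAYIAKQRQISFVKSHFSRQLEERLGLIEVQ", "GSHMKLVRPEDKAAGFTWKNNLR"]
    print(simulate_mixture(toy, n=2, lo=0.15, hi=0.30))
```

Calling `simulate_mixture` with `lo=0.30, hi=0.60` would correspond to the higher coverage regime mentioned in the abstract.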