New computational approaches for de novo peptide sequencing from MS/MS experiments

Abstract
We describe computational methods for identifying novel proteins from tandem mass spectrometry (tandem MS or MS/MS) data and introduce new approaches designed to give more accurate solutions. These approaches integrate chemical information and knowledge into a graph-theoretic framework. We investigate two sources of chemical information: mass tagging and the dissociation chemistry of the tandem MS process itself. We describe machine learning techniques that classify peaks by ion type on the basis of known dissociation chemistry. The algorithms are implemented in a software package called PepSUMS. Using PepSUMS, we report results on how effective the new methods are toward the ultimate goal of improved protein identification.
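
The abstract does not spell out the graph construction. As a rough illustration of what a graph-theoretic formulation of de novo sequencing commonly looks like (a spectrum graph), and not of PepSUMS itself, the sketch below treats masses as nodes and connects two nodes when their difference matches an amino acid residue mass. The function names, the tolerance, and the idealized input (a clean list of prefix residue masses rather than raw m/z peaks) are all assumptions made for the example.

```python
# Monoisotopic residue masses in Da (standard values).
# I and L are isobaric, so only L is listed.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "N": 114.04293, "D": 115.02694, "Q": 128.05858, "K": 128.09496,
    "E": 129.04259, "M": 131.04049, "H": 137.05891, "F": 147.06841,
    "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}


def build_spectrum_graph(masses, tol=0.01):
    """Hypothetical helper: sort the masses and return (nodes, edges).

    edges[i] lists (j, residue) pairs such that nodes[j] - nodes[i]
    matches a residue mass within tol (in Da).
    """
    nodes = sorted(masses)
    edges = {i: [] for i in range(len(nodes))}
    for i, low in enumerate(nodes):
        for j in range(i + 1, len(nodes)):
            diff = nodes[j] - low
            for residue, mass in RESIDUE_MASS.items():
                if abs(diff - mass) <= tol:
                    edges[i].append((j, residue))
    return nodes, edges


def enumerate_sequences(masses, tol=0.01):
    """Hypothetical helper: list every residue string that labels a path
    from the lightest node to the heaviest node."""
    nodes, edges = build_spectrum_graph(masses, tol)
    last = len(nodes) - 1
    found = []

    def walk(i, prefix):
        if i == last:
            found.append(prefix)
            return
        for j, residue in edges[i]:
            walk(j, prefix + residue)

    walk(0, "")
    return found


# Idealized input (assumption): prefix residue masses of a short peptide,
# starting at 0 and ending at the total residue mass.
prefix_masses = [0.0, 97.05276, 196.12117, 297.16885]
candidates = enumerate_sequences(prefix_masses)
print(candidates)
```

Real spectra contain noise peaks, missing peaks, and several ion types, so many paths are usually possible. Chemical information of the kind the abstract mentions (mass tags, peak classification by ion type) would presumably be used to prune or score nodes and edges in such a graph.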