Improving feature detection and analysis of surface-enhanced laser desorption/ionization-time of flight mass spectra

Abstract
Discovering valid biological information from surface-enhanced laser desorption/ionization-time of flight mass spectrometry (SELDI-TOF MS) depends on clear experimental design, meticulous sample handling, and sophisticated data processing. Most published literature deals with the biological aspects of these experiments, or with computer-learning algorithms to locate sets of classifying biomarkers. The process of locating and measuring proteins across spectra has received less attention. This process should be tunable between sensitivity and false-discovery, and should guarantee that features are biologically meaningful in that they represent chemical species that can be identified and investigated. Existing feature detection in SELDI-TOF MS is not optimal for acquiring biologically relevant data. Most methods have so many user-defined settings that reproducibility and comparability among studies suffer considerably. To address these issues, we have developed an approach, called simultaneous spectrum analysis (SSA), which (i) locates proteins across spectra, (ii) measures their abundance, (iii) subtracts baseline, (iv) excludes irreproducible measurements, and (v) computes normalization factors for comparing spectra. SSA uses only two key parameters for feature detection and one parameter each for quality thresholds on spectra and peaks. The effectiveness of SSA is demonstrated by identifying proteins differentially expressed in SELDI-TOF spectra from plasma of wild-type and knockout mice for plasma glutathione peroxidase. Comparing analyses by SSA and CiphergenExpress Data Manager 2.1 finds similar results for large signal peaks, but SSA improves the number and quality of differences betweens groups among lower signal peaks. SSA is also less likely to introduce systematic bias when normalizing spectra.