Large-scale database searching using tandem mass spectra: Looking up the answer in the back of the book

Abstract
Database searching is an essential element of large-scale proteomics. Because these methods are widely used, it is important to understand the rationale of the algorithms. Most algorithms are based on concepts first developed in SEQUEST and PeptideSearch. Four basic approaches are used to determine a match between a spectrum and sequence: descriptive, interpretative, stochastic and probability–based matching. We review the basic concepts used by most search algorithms, the computational modeling of peptide identification and current challenges and limitations of this approach for protein identification.