Use of Artificial Neural Networks for the Accurate Prediction of Peptide Liquid Chromatography Elution Times in Proteome Analyses

Abstract
The use of artificial neural networks (ANNs) is described for predicting the reversed-phase liquid chromatography retention times of peptides enzymatically digested from proteome-wide proteins. To enable the accurate comparison of the numerous LC/MS data sets, a genetic algorithm was developed to normalize the peptide retention data into a range (from 0 to 1), improving the peptide elution time reproducibility to ~1%. The network developed in this study was based on amino acid residue composition and consists of 20 input nodes, 2 hidden nodes, and 1 output node. A data set of ~7000 confidently identified peptides from the microorganism Deinococcus radiodurans was used for the training of the ANN. The ANN was then used to predict the elution times for another set of 5200 peptides tentatively identified by MS/MS from a different microorganism (Shewanella oneidensis). The model was found to predict the elution times of peptides with up to 54 amino acid residues (the longest peptide identified after tryptic digestion of S. oneidensis) with an average accuracy of ~3%. This predictive capability was then used to distinguish with high confidence isobaric peptides otherwise indistinguishable by accurate mass measurements, as well as to uncover peptide misidentifications. Thus, integrating ANN peptide elution time prediction into proteomic research will increase both the number of protein identifications and their confidence.
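
The 20-2-1 architecture described above can be illustrated with a minimal sketch. It assumes one input node per standard amino acid (fed with residue counts), sigmoid activations, and small random placeholder weights; the abstract does not specify the activation function, the input scaling, or the training procedure, so these choices and all function names here are hypothetical and the output is not a trained prediction.

```python
import math
import random

# Assumption: the 20 input nodes correspond to the 20 standard amino acids.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def composition_vector(peptide):
    """Count how often each standard residue occurs in the peptide sequence."""
    return [peptide.count(aa) for aa in AMINO_ACIDS]


def sigmoid(x):
    """Logistic activation (an assumption; not stated in the abstract)."""
    return 1.0 / (1.0 + math.exp(-x))


def init_weights(seed=0):
    """Random placeholder weights for a 20-2-1 network (untrained)."""
    rng = random.Random(seed)
    w_hidden = [[rng.uniform(-0.1, 0.1) for _ in range(20)] for _ in range(2)]
    b_hidden = [0.0, 0.0]
    w_out = [rng.uniform(-0.1, 0.1) for _ in range(2)]
    b_out = 0.0
    return w_hidden, b_hidden, w_out, b_out


def predict_net(peptide, weights):
    """Forward pass: 20 composition inputs -> 2 hidden nodes -> 1 output.

    The sigmoid output lies strictly between 0 and 1, which matches the
    0-to-1 range of the normalized elution time mentioned in the abstract.
    """
    w_hidden, b_hidden, w_out, b_out = weights
    x = composition_vector(peptide)
    hidden = [
        sigmoid(sum(w * xi for w, xi in zip(row, x)) + b)
        for row, b in zip(w_hidden, b_hidden)
    ]
    return sigmoid(sum(w * h for w, h in zip(w_out, hidden)) + b_out)


if __name__ == "__main__":
    weights = init_weights()
    # Illustrative sequence only; not taken from the study's data sets.
    print(predict_net("ACDEFGHIK", weights))
```

Because the inputs are residue counts only, such a network cannot distinguish peptides that share the same composition but differ in sequence order; this is a property of the composition-based design, not of the sketch.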
