Predicting Class I Major Histocompatibility Complex (MHC) Binders Using Multivariate Statistics: Comparison of Discriminant Analysis and Multiple Linear Regression
- 1 January 2007
- journal article
- research article
- Published by American Chemical Society (ACS) in Journal of Chemical Information and Modeling
- Vol. 47 (1), 234-238
- https://doi.org/10.1021/ci600318z
Abstract
The accurate in silico identification of T-cell epitopes is a critical step in the development of peptide-based vaccines, reagents, and diagnostics. It has a direct impact on the success of subsequent experimental work. Epitopes arise as a consequence of complex proteolytic processing within the cell. Prior to being recognized by T cells, an epitope is presented on the cell surface as a complex with a major histocompatibility complex (MHC) protein. A prerequisite therefore for T-cell recognition is that an epitope is also a good MHC binder. Thus, T-cell epitope prediction overlaps strongly with the prediction of MHC binding. In the present study, we compare discriminant analysis and multiple linear regression as algorithmic engines for the definition of quantitative matrices for binding affinity prediction. We apply these methods to peptides which bind the well-studied human MHC allele HLA-A*0201. A matrix which results from combining results of the two methods proved powerfully predictive under cross-validation. The new matrix was also tested on an external set of 160 binders to HLA-A*0201; it was able to recognize 135 (84%) of them.
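As an illustration of how a quantitative matrix can be applied, the sketch below scores a 9-mer peptide by summing position-specific residue contributions, and reproduces the arithmetic behind the reported external-test figure (135 of 160). It assumes the common additive form of such matrices; the coefficients, the intercept, and the names `EXAMPLE_MATRIX`, `score_peptide`, and `fraction_recognized` are hypothetical and are not taken from the paper.

```python
from typing import Dict, List

# Hypothetical coefficients for illustration only; NOT values from the paper.
# One dictionary per peptide position (9 positions), mapping an amino acid
# to its additive contribution. Residues not listed contribute 0.0.
EXAMPLE_MATRIX: List[Dict[str, float]] = [
    {},                       # position 1
    {"L": 0.5, "M": 0.25},    # position 2
    {},                       # position 3
    {},                       # position 4
    {},                       # position 5
    {},                       # position 6
    {},                       # position 7
    {},                       # position 8
    {"V": 0.5, "L": 0.25},    # position 9
]


def score_peptide(peptide: str,
                  matrix: List[Dict[str, float]],
                  intercept: float = 0.0) -> float:
    """Sum the per-position contributions of a peptide (additive model)."""
    if len(peptide) != len(matrix):
        raise ValueError(
            f"expected a peptide of length {len(matrix)}, got {len(peptide)}"
        )
    return intercept + sum(
        position.get(residue, 0.0)
        for position, residue in zip(matrix, peptide)
    )


def fraction_recognized(n_recognized: int, n_total: int) -> float:
    """Fraction of known binders that the matrix identifies as binders."""
    if n_total <= 0:
        raise ValueError("n_total must be positive")
    return n_recognized / n_total


if __name__ == "__main__":
    # An arbitrary 9-mer scored with the hypothetical matrix and intercept.
    print(score_peptide("LLFGYPVYV", EXAMPLE_MATRIX, intercept=5.0))
    # External test set from the abstract: 135 of 160 binders recognized.
    print(f"{fraction_recognized(135, 160):.0%}")
```

In this form, a discriminant-analysis matrix and a regression matrix can be applied with the same scoring function; only the coefficients and the interpretation of the resulting score differ.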