Identifying HLA supertypes by learning distance functions

Open Access

15 January 2007

journal article
research article
Published by Oxford University Press (OUP) in Bioinformatics

Vol. 23 (2) , e148-e155
https://doi.org/10.1093/bioinformatics/btl324

Abstract

Motivation: The development of epitope-based vaccines relies crucially on the ability to classify Human Leukocyte Antigen (HLA) molecules into sets that have similar peptide binding specificities, termed supertypes. In their seminal work, Sette and Sidney defined nine HLA class I supertypes and claimed that these provide almost perfect coverage of the entire repertoire of HLA class I molecules. HLA alleles are highly polymorphic and polygenic, so experimentally classifying each of these molecules into supertypes is at present an impossible task. Recently, a number of computational methods have been proposed for this task. These methods are based on defining protein similarity measures, derived either from analysis of binding peptides or from analysis of the proteins themselves.

Results: In this paper we define both peptide-derived and protein-derived similarity measures, which are based on learning distance functions. The peptide-derived measure is defined using a peptide–peptide distance function, which is learned using information about known binding and non-binding peptides. The protein-derived similarity measure is defined using a protein–protein distance function, which is learned using information about alleles previously classified into supertypes by Sette and Sidney (1999). We compare the classifications obtained by these two complementary methods to previously suggested classification methods. In general, our results are in excellent agreement with the classifications suggested by Sette and Sidney (1999) and with those reported by Buus et al. (2004). The main advantage of our proposed distance-based approach is that it makes use of two different and important immunological sources of information: HLA alleles, and peptides that are known to bind or not bind to these alleles. Since each of our distance measures is trained using a different source of information, their combination can provide a more confident classification of alleles into supertypes.
Contact: tomboy@cs.huji.ac.il; cheny@cs.huji.ac.il
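
As a rough illustration of the idea in the Results paragraph, the sketch below groups alleles into candidate supertypes from two allele-allele distance tables, one standing in for the peptide-derived measure and one for the protein-derived measure. Everything here is an assumption made for illustration: the abstract does not say how the two measures are combined or which clustering procedure is used, so the weighted average in `combine` and the average-linkage clustering in `average_linkage` are stand-in choices, the allele names `a1`, `a2`, `b1`, `b2` are placeholders, and the distance values are invented toy numbers rather than learned distances or results from the paper.

```python
from itertools import combinations


def combine(d_pep, d_prot, weight=0.5):
    """Weighted average of two distance tables keyed by frozenset allele pairs.

    Assumption: a simple convex combination; not taken from the paper.
    """
    return {pair: weight * d_pep[pair] + (1 - weight) * d_prot[pair]
            for pair in d_pep}


def average_linkage(items, dist, n_clusters):
    """Agglomerative clustering: repeatedly merge the two closest clusters."""
    clusters = [[x] for x in items]

    def linkage(a, b):
        # Mean pairwise distance between members of clusters a and b.
        total = sum(dist[frozenset((x, y))] for x in a for y in b)
        return total / (len(a) * len(b))

    while len(clusters) > n_clusters:
        i, j = min(combinations(range(len(clusters)), 2),
                   key=lambda ij: linkage(clusters[ij[0]], clusters[ij[1]]))
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]  # j > i, so index i is unaffected
    return [sorted(c) for c in clusters]


def table(values):
    """Build a symmetric distance table from {(x, y): distance} entries."""
    return {frozenset(pair): d for pair, d in values.items()}


# Toy, invented distances (hypothetical; for illustration only).
alleles = ["a1", "a2", "b1", "b2"]
d_peptide = table({("a1", "a2"): 0.10, ("b1", "b2"): 0.20,
                   ("a1", "b1"): 0.90, ("a1", "b2"): 0.80,
                   ("a2", "b1"): 0.85, ("a2", "b2"): 0.95})
d_protein = table({("a1", "a2"): 0.20, ("b1", "b2"): 0.20,
                   ("a1", "b1"): 0.70, ("a1", "b2"): 0.75,
                   ("a2", "b1"): 0.80, ("a2", "b2"): 0.70})

supertypes = average_linkage(alleles, combine(d_peptide, d_protein), 2)
print(supertypes)
```

In this toy setup the two tables agree on which alleles are close, so the combined distance separates the `a` alleles from the `b` alleles. Where the two sources disagree, the `weight` parameter controls how much each one influences the grouping.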

This publication has 21 references indexed in Scilit:

PepDist: A New Framework for Protein-Peptide Binding Prediction based on Learning Peptide Distance Functions
BMC Bioinformatics, 2006
Predicting Protein-Peptide Binding Affinity by Learning Peptide-Peptide Distance Functions
Published by Springer Nature, 2005
Definition of supertypes for HLA molecules using clustering of specificity matrices
Immunogenetics, 2004
Boosting margin based distance functions for clustering
Published by Association for Computing Machinery (ACM), 2004
Sensitive quantitative predictions of peptide‐MHC binding by a ‘Query by Committee’ artificial neural network approach
Tissue Antigens, 2003
IMGT/HLA and IMGT/MHC: sequence databases for the study of the major histocompatibility complex
Nucleic Acids Research, 2003
New quantitative descriptors of amino acids based on multidimensional scaling of a large number of physical-chemical properties
Journal of Molecular Modeling, 2001
Immunodominance in major histocompatibility complex class I–restricted T lymphocyte responses
Annual Review of Immunology, 1999
MHCPEP, a database of MHC-binding peptides: update 1997
Nucleic Acids Research, 1998
Definition of an HLA-A3-like supermotif demonstrates the overlapping peptide-binding repertoires of common HLA molecules
Human Immunology, 1996