Identifying HLA supertypes by learning distance functions
Open Access
- 15 January 2007
- journal article
- research article
- Published by Oxford University Press (OUP) in Bioinformatics
- Vol. 23 (2) , e148-e155
- https://doi.org/10.1093/bioinformatics/btl324
Abstract
Motivation: The development of epitope-based vaccines crucially relies on the ability to classify Human Leukocyte Antigen (HLA) molecules into sets that have similar peptide binding specificities, termed supertypes. In their seminal work, Sette and Sidney defined nine HLA class I supertypes and claimed that these provide an almost perfect coverage of the entire repertoire of HLA class I molecules. HLA alleles are highly polymorphic and polygenic, so experimentally classifying each of these molecules into supertypes is at present an impossible task. Recently, a number of computational methods have been proposed for this task. These methods are based on defining protein similarity measures, derived either from analysis of binding peptides or from analysis of the proteins themselves.
Results: In this paper we define both peptide derived and protein derived similarity measures, which are based on learning distance functions. The peptide derived measure is defined using a peptide–peptide distance function, which is learned using information about known binding and non-binding peptides. The protein derived similarity measure is defined using a protein–protein distance function, which is learned using information about alleles previously classified to supertypes by Sette and Sidney (1999). We compare the classifications obtained by these two complementary methods to previously suggested classification methods. In general, our results are in excellent agreement with the classifications suggested by Sette and Sidney (1999) and with those reported by Buus et al. (2004). The main advantage of our proposed distance-based approach is that it makes use of two different and important immunological sources of information: HLA alleles, and peptides that are known to bind or not bind to these alleles. Since each of our distance measures is trained using a different source of information, their combination can provide a more confident classification of alleles to supertypes.
Contact: tomboy@cs.huji.ac.il; cheny@cs.huji.ac.il
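The abstract's closing point, that two independently trained distance measures can be combined for a more confident classification, can be illustrated with a minimal sketch. It is not the authors' method: the distance functions are not learned here, the combination rule (accept a supertype only when both measures agree on the nearest already-classified allele) is an assumption, and every name and number below is a hypothetical placeholder.

```python
from typing import Dict, Optional, Tuple

# Maps (unclassified allele, reference allele) to a distance.
Distance = Dict[Tuple[str, str], float]


def nearest_supertype(allele: str, labeled: Dict[str, str], dist: Distance) -> str:
    """Return the supertype of the reference allele closest to `allele`."""
    closest = min(labeled, key=lambda ref: dist[(allele, ref)])
    return labeled[closest]


def consensus_supertype(
    allele: str,
    labeled: Dict[str, str],
    peptide_dist: Distance,
    protein_dist: Distance,
) -> Optional[str]:
    """Assumed rule: accept a supertype only if both measures agree on it."""
    by_peptide = nearest_supertype(allele, labeled, peptide_dist)
    by_protein = nearest_supertype(allele, labeled, protein_dist)
    return by_peptide if by_peptide == by_protein else None


# Hypothetical reference alleles with known supertypes (placeholder names).
labeled = {"ref_1": "S1", "ref_2": "S2"}

# Made-up distances standing in for the two learned measures.
peptide_dist = {
    ("allele_x", "ref_1"): 0.1, ("allele_x", "ref_2"): 0.8,
    ("allele_y", "ref_1"): 0.6, ("allele_y", "ref_2"): 0.5,
}
protein_dist = {
    ("allele_x", "ref_1"): 0.2, ("allele_x", "ref_2"): 0.9,
    ("allele_y", "ref_1"): 0.3, ("allele_y", "ref_2"): 0.7,
}

for name in ("allele_x", "allele_y"):
    print(name, consensus_supertype(name, labeled, peptide_dist, protein_dist))
```

In this toy data both measures place `allele_x` nearest to `ref_1`, so it is assigned `S1`. They disagree on `allele_y`, so the function returns `None` and the allele is left unclassified.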
This publication has 21 references indexed in Scilit:
- PepDist: A New Framework for Protein-Peptide Binding Prediction based on Learning Peptide Distance Functions. BMC Bioinformatics, 2006
- Predicting Protein-Peptide Binding Affinity by Learning Peptide-Peptide Distance Functions. Published by Springer Nature, 2005
- Definition of supertypes for HLA molecules using clustering of specificity matrices. Immunogenetics, 2004
- Boosting margin based distance functions for clustering. Published by Association for Computing Machinery (ACM), 2004
- Sensitive quantitative predictions of peptide‐MHC binding by a ‘Query by Committee’ artificial neural network approach. Tissue Antigens, 2003
- IMGT/HLA and IMGT/MHC: sequence databases for the study of the major histocompatibility complex. Nucleic Acids Research, 2003
- New quantitative descriptors of amino acids based on multidimensional scaling of a large number of physical-chemical properties. Journal of Molecular Modeling, 2001
- Immunodominance in Major Histocompatibility Complex Class I–Restricted T Lymphocyte Responses. Annual Review of Immunology, 1999
- MHCPEP, a database of MHC-binding peptides: update 1997. Nucleic Acids Research, 1998
- Definition of an HLA-A3-like supermotif demonstrates the overlapping peptide-binding repertoires of common HLA molecules. Human Immunology, 1996