On the viability of unsupervised T-cell receptor sequence clustering for epitope preference
- 21 September 2018
- journal article
- research article
- Published by Oxford University Press (OUP) in Bioinformatics
- Vol. 35 (9) , 1461-1468
- https://doi.org/10.1093/bioinformatics/bty821
Abstract
The T-cell receptor (TCR) is responsible for recognizing epitopes presented on cell surfaces. Linking TCR sequences to their ability to target specific epitopes is currently an unsolved problem, yet one of great interest. Indeed, it is currently unknown how dissimilar TCR sequences can be before they no longer bind the same epitope. This question is confounded by the fact that there are many ways to define the similarity between two TCR sequences. Here we investigate both issues in the context of unsupervised TCR sequence clustering. We provide an overview of the performance of various distance metrics on two large independent data sets with 412 and 2835 TCR sequences respectively. Our results confirm the presence of structurally distinct TCR groups that target identical epitopes. In addition, we put forward several recommendations for performing unsupervised T-cell receptor sequence clustering. Source code, implemented in Python 3, is available at https://github.com/pmeysman/TCRclusteringPaper. Supplementary data are available at Bioinformatics online.
Funding Information
- BOF Concerted Research Action (PS ID 30730)
- IOF
- SBO
- Antwerp Study Centre for Infectious Diseases
- Research Foundation Flanders
- FWO (G067118N)
- NDN (1S29816N)