The workings and failings of clustering T-cell receptor beta-chain sequences without a known epitope preference
Preprint
- 15 May 2018
- Published by Cold Spring Harbor Laboratory in bioRxiv
- p. 318360
- https://doi.org/10.1101/318360
Abstract
The T-cell receptor is responsible for recognizing potentially harmful epitopes presented on cell surfaces. The binding rules that govern this recognition between receptor and epitope are currently an unsolved problem, yet one of great interest. Several methods have recently been proposed to perform supervised classification of T-cell receptor sequences, but these require known examples of T-cell receptor sequences for a given epitope. Here we study the viability of various methods to perform unsupervised clustering of distinct T-cell receptor sequences and how the resulting clusters relate to their target epitopes. The goal is to provide an overview of the performance of various distance metrics on two large independent T-cell receptor sequence data sets. Our results confirm the presence of structurally distinct T-cell groups that target identical epitopes. In addition, we put forward several recommendations for performing T-cell receptor sequence clustering.
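As a rough illustration of what clustering receptor sequences with a distance metric can look like, the following minimal sketch groups CDR3 beta-chain amino acid sequences by edit distance and then checks how well the groups agree with epitope labels. Everything in it is an assumption made for illustration: the abstract does not name the metrics or the clustering procedure, so Levenshtein distance, the single-linkage threshold `max_dist`, the purity score, and the toy sequences and labels are all hypothetical and are not taken from the study's data sets.

```python
from collections import Counter


def levenshtein(a: str, b: str) -> int:
    """Edit distance counting insertions, deletions and substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(prev[j] + 1,                # deletion
                            curr[j - 1] + 1,            # insertion
                            prev[j - 1] + (ca != cb)))  # substitution
        prev = curr
    return prev[-1]


def cluster(seqs, max_dist=1):
    """Single-linkage clustering: connected components of the graph that
    links every pair of sequences within max_dist of each other.
    Returns lists of indices into seqs."""
    parent = list(range(len(seqs)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path compression
            i = parent[i]
        return i

    for i in range(len(seqs)):
        for j in range(i + 1, len(seqs)):
            if levenshtein(seqs[i], seqs[j]) <= max_dist:
                parent[find(i)] = find(j)

    groups = {}
    for i in range(len(seqs)):
        groups.setdefault(find(i), []).append(i)
    return list(groups.values())


def purity(clusters, labels):
    """Fraction of sequences that carry the majority label of their cluster."""
    total = sum(len(c) for c in clusters)
    hits = sum(Counter(labels[i] for i in c).most_common(1)[0][1]
               for c in clusters)
    return hits / total


# Hypothetical toy data: made-up CDR3 sequences and made-up epitope labels.
seqs = ["CASSLGQAYEQYF", "CASSLGQSYEQYF", "CASSLGNAYEQYF",
        "CASSPDRGGYTF", "CASSPDRGAYTF", "CSARDGTGNGYTF"]
labels = ["epitope_A", "epitope_A", "epitope_A",
          "epitope_B", "epitope_B", "epitope_A"]

clusters = cluster(seqs, max_dist=1)
print(clusters)
print(purity(clusters, labels))
```

In this toy input the last sequence shares a label with the first three but is far from them in edit distance, so it ends up in its own cluster. That is the kind of situation the abstract describes, where distinct groups of receptors target the same epitope. The pairwise loop is quadratic in the number of sequences, so a sketch like this is only practical for small inputs.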
All Related Versions
- Published version: Bioinformatics, 35 (9), 1461.