Fast multiclonal clusterization of V(D)J recombinations from high-throughput sequencing
Open Access
- 28 May 2014
- journal article
- research article
- Published by Springer Nature in BMC Genomics
- Vol. 15 (1) , 1-12
- https://doi.org/10.1186/1471-2164-15-409
Abstract
V(D)J recombinations in lymphocytes are essential for immunological diversity. They are also useful markers of pathologies. In leukemia, they are used to quantify the minimal residual disease during patient follow-up. However, the full breadth of lymphocyte diversity is not yet well understood. We propose new algorithms that process high-throughput sequencing (HTS) data to extract unnamed V(D)J junctions and gather them into clones for quantification. This analysis is based on a seed heuristic and is fast and scalable because, in the first phase, no alignment is performed with germline database sequences. The algorithms were applied to TR γ HTS data from a patient with acute lymphoblastic leukemia, and to data simulating hypermutations. Our methods identified the main clone, as well as additional clones that were not identified with standard protocols. The proposed algorithms provide new insight into the analysis of high-throughput sequencing data for leukemia, and into the quantitative assessment of any immunological profile. The methods described here are implemented in a C++ open-source program called Vidjil.
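The abstract does not spell out the seed heuristic, so the following Python sketch is only a toy illustration of the general idea, not Vidjil's implementation. It assumes contiguous k-mers as seeds, a fixed-length window centred on the estimated junction, and clustering by exact window identity. The function names (`build_index`, `junction_window`, `cluster_reads`) and the parameters `k` and `w` are hypothetical. Germline sequences are used only to label k-mers as V or J; no read is aligned against them.

```python
from collections import Counter


def kmers(seq, k):
    """All overlapping k-mers of seq, in order."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def build_index(v_genes, j_genes, k):
    """Map each germline k-mer to 'V' or 'J'; k-mers found in both are dropped."""
    index = {}
    for label, genes in (("V", v_genes), ("J", j_genes)):
        for gene in genes:
            for km in kmers(gene, k):
                # A k-mer already seen under another label becomes ambiguous (None).
                index[km] = label if index.get(km, label) == label else None
    return {km: lab for km, lab in index.items() if lab is not None}


def junction_window(read, index, k, w):
    """Return a w-base window around the estimated V/J junction, or None."""
    labels = [index.get(km) for km in kmers(read, k)]
    n = len(labels)
    # v_before[i]: V k-mers starting before position i; j_after[i]: J k-mers from i on.
    v_before = [0] * (n + 1)
    for i, lab in enumerate(labels):
        v_before[i + 1] = v_before[i] + (lab == "V")
    j_after = [0] * (n + 1)
    for i in range(n - 1, -1, -1):
        j_after[i] = j_after[i + 1] + (labels[i] == "J")
    if v_before[n] == 0 or j_after[0] == 0:
        return None  # the read does not show both a V side and a J side
    # The best split points form a plateau over the junction; take its middle.
    scores = [v_before[i] + j_after[i] for i in range(n + 1)]
    best = max(scores)
    plateau = [i for i, s in enumerate(scores) if s == best]
    centre = (plateau[0] + plateau[-1] + k) // 2
    start = centre - w // 2
    if start < 0 or start + w > len(read):
        return None  # the window would run off the read
    return read[start:start + w]


def cluster_reads(reads, index, k, w):
    """Count reads per junction window; each distinct window stands for one clone."""
    clones = Counter()
    for read in reads:
        window = junction_window(read, index, k, w)
        if window is not None:
            clones[window] += 1
    return clones
```

Because the window is anchored on the junction rather than on the read ends, reads of different lengths that cover the same recombination fall into the same cluster. Exact window identity is a simplification; tolerating sequencing errors or hypermutations would require a further step that compares windows approximately.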