Recovering probabilities for nucleotide trimming processes for T cell receptor TRA and TRG V-J junctions analyzed with IMGT tools
Open Access
- 2 October 2008
- journal article
- research article
- Published by Springer Nature in BMC Bioinformatics
- Vol. 9 (1) , 408
- https://doi.org/10.1186/1471-2105-9-408
Abstract
Nucleotides are trimmed from the ends of variable (V), diversity (D) and joining (J) genes during immunoglobulin (IG) and T cell receptor (TR) rearrangements in B cells and T cells of the immune system. This trimming is followed by addition of nucleotides at random, forming the N regions (N for nucleotides) of the V-J and V-D-J junctions. These processes are crucial for creating diversity in the immune response since the number of trimmed nucleotides and the number of added nucleotides vary in each B or T cell. IMGT® sequence analysis tools, IMGT/V-QUEST and IMGT/JunctionAnalysis, are able to provide detailed and accurate analysis of the final observed junction nucleotide sequences (tool "output"). However, as trimmed nucleotides can potentially be replaced by identical N region nucleotides during the process, the observed "output" represents a biased estimate of the "true trimming process". A probabilistic approach based on an analysis of the standardized tool "output" is proposed to infer the probability distribution of the "true trimming process" and to provide plausible biological hypotheses explaining this process.
We collated a benchmark dataset of TR alpha (TRA) and TR gamma (TRG) V-J rearranged sequences and junctions analysed with IMGT/V-QUEST and IMGT/JunctionAnalysis, the nucleotide sequence analysis tools from IMGT®, the international ImMunoGeneTics information system®, http://imgt.cines.fr. The standardized description of the tool output is based on the IMGT-ONTOLOGY axioms and concepts. We propose a simple first-order model that attempts to transform the observed "output" probability distribution into an estimate closer to the "true trimming process" probability distribution. We use this estimate to test the hypothesis that Poisson processes are involved in trimming. This hypothesis was not rejected at standard confidence levels for three of the four trimming processes: TRAV, TRAJ and TRGV.
By using trimming of rearranged TR genes as a benchmark, we show that a probabilistic approach, applied to IMGT® standardized tool "outputs", opens the way to plausible hypotheses on the events involved in the "true trimming process" and eventually to an exact quantification of trimming itself. With increasing high-throughput of standardized immunogenetics data, similar probabilistic approaches will improve understanding of processes so far only characterized by the "output" of standardized tools.
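The bias described in the abstract can be illustrated with a small simulation. The sketch below is not the authors' first-order model: it assumes, purely for illustration, that the true number of trimmed nucleotides is Poisson distributed, that each added N nucleotide matches the trimmed germline nucleotide with probability 1/4, and that matching nucleotides adjacent to the gene end make the trimming look shorter. The rates `4.0` and `3.0`, the function names and the bin count are hypothetical choices, not values from the paper.

```python
import math
import random


def sample_poisson(rng, lam):
    """Draw one Poisson(lam) variate with Knuth's multiplication method."""
    threshold = math.exp(-lam)
    k, product = 0, rng.random()
    while product > threshold:
        k += 1
        product *= rng.random()
    return k


def observed_trim(rng, true_trim, n_added, p_match=0.25):
    """Return the trimming a junction tool would report (assumed masking model).

    Each consecutive N nucleotide next to the gene end re-creates the trimmed
    germline nucleotide with probability p_match, hiding one trimmed position.
    """
    masked = 0
    while masked < min(true_trim, n_added) and rng.random() < p_match:
        masked += 1
    return true_trim - masked


def poisson_chi_square(counts, max_bin=8):
    """Pearson chi-square statistic against a Poisson fit (lambda = sample mean).

    Counts of max_bin or more are pooled into one tail bin. Small expected
    counts are not merged here, which a careful analysis would do.
    """
    n = len(counts)
    lam = sum(counts) / n
    observed = [0] * (max_bin + 1)
    for c in counts:
        observed[min(c, max_bin)] += 1
    probs = [math.exp(-lam) * lam ** k / math.factorial(k) for k in range(max_bin)]
    probs.append(1.0 - sum(probs))  # pooled tail probability
    stat = sum((o - n * p) ** 2 / (n * p) for o, p in zip(observed, probs) if p > 0)
    dof = (max_bin + 1) - 1 - 1  # bins, minus 1, minus 1 estimated parameter
    return stat, dof


rng = random.Random(0)  # fixed seed so the run is repeatable
n_cells = 5000
true_trims = [sample_poisson(rng, 4.0) for _ in range(n_cells)]   # assumed rate
n_added = [sample_poisson(rng, 3.0) for _ in range(n_cells)]      # assumed rate
obs_trims = [observed_trim(rng, t, n) for t, n in zip(true_trims, n_added)]

true_mean = sum(true_trims) / n_cells
obs_mean = sum(obs_trims) / n_cells
stat_true, dof = poisson_chi_square(true_trims)
stat_obs, _ = poisson_chi_square(obs_trims)

print(f"mean true trimming     : {true_mean:.3f}")
print(f"mean observed trimming : {obs_mean:.3f}")
print(f"chi-square (true)      : {stat_true:.2f} on {dof} d.o.f.")
print(f"chi-square (observed)  : {stat_obs:.2f} on {dof} d.o.f.")
```

Under these assumptions the observed mean is systematically lower than the true mean, which is the kind of distortion the abstract says must be corrected before testing a Poisson hypothesis. The chi-square statistics would be compared with a critical value for the stated degrees of freedom; that step is left out so the sketch needs only the standard library.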