Retroviral Integration Process in the Human Genome: Is It Really Non-Random? A New Statistical Approach
Open Access
- 8 August 2008
- journal article
- research article
- Published by Public Library of Science (PLoS) in PLoS Computational Biology
- Vol. 4 (8) , e1000144
- https://doi.org/10.1371/journal.pcbi.1000144
Abstract
Retroviral vectors are widely used in gene therapy to introduce therapeutic genes into patients' cells because, once delivered to the nucleus, the genes of interest are stably inserted (integrated) into the target cell genome. There is now compelling evidence that integration of retroviral vectors follows non-random patterns in mammalian genomes, with a preference for active genes and regulatory regions. In particular, Moloney Leukemia Virus (MLV)-derived vectors tend to integrate near the transcription start site (TSS) of genes, occasionally deregulating gene expression and, where proto-oncogenes are targeted, initiating tumors. This has drawn the attention of the scientific community to the molecular determinants of the retroviral integration process and to statistical methods for evaluating the genome-wide distribution of integration sites. In recent approaches, the observed distribution of MLV integration distances (IDs) from the TSS of the nearest gene is assumed to be non-random on the basis of an empirical comparison with a random distribution generated by computational simulation. To provide a statistical procedure for testing the randomness of the retroviral insertion pattern, we propose a probability model (Beta distribution) based on IDs between two consecutive genes. We apply the procedure to a set of 595 unique MLV insertion sites retrieved from human hematopoietic stem/progenitor cells. A goodness-of-fit test shows that this distribution suits the observed data. Our statistical analysis confirms the preference of MLV-based vectors for integrating in promoter-proximal regions.

Understanding how retroviral vectors (such as Moloney Leukemia Virus-based vectors) integrate in the human genome has become a major safety issue in the field of gene therapy, since a concrete risk of developing tumors associated with the integration process has been identified in the clinical setting. Moloney Leukemia Virus-based vectors are apparently characterized by a non-random integration pattern, with a preference for the vicinity of active gene transcription start sites. We approach the problem of non-random retroviral integration from a probabilistic point of view. We model a normalized integration distance from the transcription start site of the nearest upstream or downstream gene. From this model, we derive a simple and straightforward testing procedure to assess whether the transcription start site of a given gene attracts integration events. Our approach overcomes the issues of differing gene length, gene orientation, and gene density, which are often critical in analyzing integration distances from transcription start sites. The approach is tested on real experimental data retrieved from human hematopoietic stem/progenitor cells.
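The idea of a normalized integration distance and a Beta model can be illustrated with a minimal Python sketch. It is not the authors' implementation; it rests on assumptions that are not stated in the text above. Each site is scaled to [0, 1] by its position between the two flanking TSSs. Random integration is taken to correspond to a uniform distribution, that is, Beta(1, 1). The Beta parameters are estimated by the method of moments, and randomness is checked with a one-sample Kolmogorov-Smirnov statistic against the asymptotic 5% critical value 1.358/sqrt(n). All function names are hypothetical, and the data in the usage note are simulated, not experimental.

```python
import math
import random


def normalized_distance(site, tss_left, tss_right):
    """Scale a site's position between two flanking TSSs to [0, 1].

    0 means the site sits on the left TSS, 1 on the right TSS.
    Dividing by the inter-TSS span removes the effect of gene density.
    """
    if not tss_left < tss_right:
        raise ValueError("tss_left must be smaller than tss_right")
    if not tss_left <= site <= tss_right:
        raise ValueError("site must lie between the two TSSs")
    return (site - tss_left) / (tss_right - tss_left)


def beta_method_of_moments(xs):
    """Estimate Beta(a, b) parameters from values in (0, 1) by matching
    the sample mean and the sample variance (assumed estimator)."""
    n = len(xs)
    mean = sum(xs) / n
    var = sum((x - mean) ** 2 for x in xs) / (n - 1)
    common = mean * (1.0 - mean) / var - 1.0
    if common <= 0:
        raise ValueError("sample variance too large for a Beta distribution")
    return mean * common, (1.0 - mean) * common


def ks_against_uniform(xs):
    """One-sample Kolmogorov-Smirnov test against Uniform(0, 1).

    Returns (D, critical_value, reject), where critical_value is the
    asymptotic 5% threshold 1.358 / sqrt(n).
    """
    xs = sorted(xs)
    n = len(xs)
    # The empirical CDF steps from i/n to (i+1)/n at xs[i];
    # the Uniform(0, 1) CDF at x is x itself.
    d = max(max((i + 1) / n - x, x - i / n) for i, x in enumerate(xs))
    critical = 1.358 / math.sqrt(n)
    return d, critical, d > critical


if __name__ == "__main__":
    random.seed(0)
    # Simulated normalized distances clustered near a TSS (illustrative only).
    clustered = [random.betavariate(0.5, 2.0) for _ in range(595)]
    a, b = beta_method_of_moments(clustered)
    d, critical, reject = ks_against_uniform(clustered)
    print(f"a={a:.2f} b={b:.2f} D={d:.3f} critical={critical:.3f} reject={reject}")
```

In this sketch, estimated parameters far from a = b = 1, together with a KS statistic above the critical value, would indicate that the sites are not spread uniformly between consecutive TSSs. A shape parameter a below 1 would be consistent with clustering near the left TSS.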