Phylogenetic Detection of Recombination with a Bayesian Prior on the Distance between Trees
Open Access
- 9 July 2008
- journal article
- research article
- Published by Public Library of Science (PLoS) in PLOS ONE
- Vol. 3 (7) , e2651
- https://doi.org/10.1371/journal.pone.0002651
Abstract
Genomic regions participating in recombination events may support distinct topologies, and phylogenetic analyses should incorporate this heterogeneity. Existing phylogenetic methods for recombination detection are challenged by the enormous number of possible topologies, even for a moderate number of taxa. If, however, the detection analysis is conducted independently between each putative recombinant sequence and a set of reference parentals, potential recombinations between the recombinants are neglected. In this context, a recombination hotspot can be inferred in phylogenetic analyses if we observe several consecutive breakpoints. We developed a distance measure between unrooted topologies that closely resembles the number of recombinations. By introducing a prior distribution on these recombination distances, a Bayesian hierarchical model was devised to detect phylogenetic inconsistencies occurring due to recombinations. This model relaxes the assumption of known parental sequences, still common in HIV analysis, allowing the entire dataset to be analyzed at once. On simulated datasets with up to 16 taxa, our method correctly detected recombination breakpoints and the number of recombination events for each breakpoint. The procedure is robust to rate and transition∶transversion heterogeneities for simulations with and without recombination. This recombination distance is related to recombination hotspots. Applying this procedure to a genomic HIV-1 dataset, we found evidence for hotspots and de novo recombination.
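To give a sense of the "enormous number of possible topologies" the abstract mentions, the short sketch below counts the unrooted binary tree topologies on n labeled taxa using the standard double-factorial formula (2n-5)!!. This is a general combinatorial fact rather than part of the paper's method, and the function name `count_unrooted_topologies` is chosen here purely for illustration.

```python
def count_unrooted_topologies(n_taxa: int) -> int:
    """Count unrooted binary tree topologies on n_taxa labeled leaves.

    Uses the standard result (2n - 5)!! = 3 * 5 * ... * (2n - 5), valid for n >= 3.
    The function name is an illustrative assumption, not taken from the paper.
    """
    if n_taxa < 3:
        raise ValueError("at least 3 taxa are needed for an unrooted binary tree")
    count = 1
    # Multiply the odd numbers 3, 5, ..., 2n - 5 (the range is empty when n = 3).
    for k in range(3, 2 * n_taxa - 4, 2):
        count *= k
    return count


if __name__ == "__main__":
    for n in (4, 8, 16):
        print(n, count_unrooted_topologies(n))
```

For the 16 taxa used in the largest simulations, this gives 213,458,046,676,875 topologies, which suggests why exhaustively comparing topologies per genomic segment is impractical.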