An Exact Nonparametric Method for Inferring Mosaic Structure in Sequence Triplets
- 1 June 2007
- journal article
- Published by Oxford University Press (OUP) in Genetics
- Vol. 176 (2) , 1035-1047
- https://doi.org/10.1534/genetics.106.068874
Abstract
Statistical tests for detecting mosaic structure or recombination among nucleotide sequences usually rely on identifying a pattern or a signal that would be unlikely to appear under clonal reproduction. Dozens of such tests have been described, but many are hampered by long running times, confounding of selection and recombination, and/or inability to isolate the mosaic-producing event. We introduce a test that is exact, nonparametric, rapidly computable, free of the infinite-sites assumption, able to distinguish between recombination and variation in mutation/fixation rates, and able to identify the breakpoints and sequences involved in the mosaic-producing event. Our test considers three sequences at a time: two parent sequences that may have recombined, with one or two breakpoints, to form the third sequence (the child sequence). Excess similarity of the child sequence to a candidate recombinant of the parents is a sign of recombination; we take the maximum value of this excess similarity as our test statistic Δ_{m,n,b}. We present a method for rapidly calculating the distribution of Δ_{m,n,b} and demonstrate that it has comparable power to and a much improved running time over previous methods, especially in detecting recombination in large data sets.
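
The triplet idea in the abstract can be illustrated with a small Python sketch. It rests on assumptions the abstract does not state: that "excess similarity" can be read as the largest drop (maximum descent) in a walk that steps +1 where the child matches only one parent and -1 where it matches only the other, and that a shuffle-based p-value can stand in for the exact distribution the paper computes. The function names `informative_steps`, `max_descent`, and `permutation_p_value` are hypothetical and are not taken from the paper or its software.

```python
import random
from typing import List


def informative_steps(parent_p: str, parent_q: str, child: str) -> List[int]:
    """Return +1 where the child matches only parent_p, -1 where only parent_q."""
    if not (len(parent_p) == len(parent_q) == len(child)):
        raise ValueError("sequences must be aligned to the same length")
    steps = []
    for p, q, c in zip(parent_p, parent_q, child):
        if p == q:
            continue  # parents agree here, so the site cannot discriminate
        if c == p:
            steps.append(1)
        elif c == q:
            steps.append(-1)
        # sites where the child matches neither parent are skipped
    return steps


def max_descent(steps: List[int]) -> int:
    """Largest drop from a running peak of the cumulative walk to a later point."""
    height = peak = best = 0
    for s in steps:
        height += s
        peak = max(peak, height)
        best = max(best, peak - height)
    return best


def permutation_p_value(steps: List[int], trials: int = 2000, seed: int = 0) -> float:
    """Monte Carlo stand-in (an assumption, NOT the paper's exact calculation):
    shuffle the step order and count how often the descent is at least as large."""
    rng = random.Random(seed)
    observed = max_descent(steps)
    shuffled = list(steps)
    hits = 0
    for _ in range(trials):
        rng.shuffle(shuffled)
        if max_descent(shuffled) >= observed:
            hits += 1
    return (hits + 1) / (trials + 1)


# Toy triplet: the child copies parent P for five sites, then parent Q.
steps = informative_steps("AAAAAAAAAA", "CCCCCCCCCC", "AAAAACCCCC")
print(max_descent(steps), permutation_p_value(steps))
```

In this toy case the child switches from one parent to the other midway, so the walk climbs and then falls, giving a large descent. A child that alternates between parents site by site produces only small descents, which is the pattern expected without a mosaic-producing event.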