Two‐Sample Tests for Comparing Intra‐Individual Genetic Sequence Diversity between Populations
- 28 February 2005
- journal article
- research article
- Published by Oxford University Press (OUP) in Biometrics
- Vol. 61 (1) , 106-117
- https://doi.org/10.1111/j.0006-341x.2005.020719.x
Abstract
Consider a study of two groups of individuals, each infected with a genetically related but heterogeneous population of viruses, in which multiple viral sequences are sampled from each person. Based on estimates of genetic distances between pairs of aligned viral sequences within individuals, we develop four new tests to compare intra-individual genetic sequence diversity between the two groups. This problem is complicated by two levels of dependency in the data structure: (i) Within an individual, any pairwise distances that share a common sequence are positively correlated; and (ii) for any two pairings of individuals which share a person, the two differences in intra-individual distances between the paired individuals are positively correlated. The first proposed test is based on the difference in mean intra-individual pairwise distances pooled over all individuals in each group, standardized by a variance estimate that corrects for the correlation structure using U-statistic theory. The second procedure is a nonparametric rank-based analog of the first test, and the third test contrasts the set of subject-specific average intra-individual pairwise distances between the groups. These tests are very easy to use and solve correlation problem (i). The fourth procedure is based on a linear combination of all possible U-statistics calculated on independent, identically distributed sequence subdatasets, over the two levels (i) and (ii) of dependencies in the data, and is more complicated than the other tests but can be more powerful. Although the proposed methods are empirical and do not fully utilize knowledge from population genetics, the tests reflect biology through the evolutionary models used to derive the pairwise sequence distances. The new tests are evaluated theoretically and in a simulation study, and are applied to a dataset of 200 HIV sequences sampled from 21 children.
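To make the idea behind the third test concrete, the following is a minimal sketch rather than the authors' procedure. It reduces each individual to one number, the average pairwise distance among that person's sequences, and then contrasts the two groups of per-subject averages. Because each individual contributes a single summary, the within-individual correlation in (i) no longer enters the comparison. Two things here are assumptions not taken from the abstract: the distance is a plain p-distance (proportion of differing aligned sites) standing in for a distance derived from an evolutionary model, and the contrast is a Welch-type t statistic, since the abstract does not state the exact form of the test. The function names are illustrative.

```python
from itertools import combinations
from math import sqrt
from statistics import mean, variance


def p_distance(a, b):
    """Proportion of aligned sites at which two sequences differ.

    Assumption: a simple stand-in for a model-based genetic distance.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def subject_diversity(seqs):
    """Mean pairwise distance over all pairs of sequences from one individual."""
    pairs = list(combinations(seqs, 2))
    if not pairs:
        raise ValueError("need at least two sequences per individual")
    return mean(p_distance(a, b) for a, b in pairs)


def welch_t(group1, group2):
    """Welch-type t statistic contrasting per-subject diversities.

    Each group is a list of individuals; each individual is a list of
    aligned sequences. Assumption: the abstract does not specify this
    statistic, it is used here only for illustration.
    """
    d1 = [subject_diversity(s) for s in group1]
    d2 = [subject_diversity(s) for s in group2]
    se = sqrt(variance(d1) / len(d1) + variance(d2) / len(d2))
    return (mean(d1) - mean(d2)) / se


# Toy data (made up for illustration, not from the study).
group_a = [["AAAA", "AAAT", "AATT"], ["AAAA", "AAAA", "AAAT"]]
group_b = [["AAAA", "TTTT"], ["AAAA", "AATT", "TTTT"]]
t_stat = welch_t(group_a, group_b)
```

A negative `t_stat` here would indicate lower average intra-individual diversity in the first group. A p-value would require a reference distribution (for example a t approximation or a permutation of group labels), which this sketch leaves out.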