SNPs, haplotypes, and model selection in a candidate gene region: The SIMPle analysis for multilocus data
- 12 November 2004
- journal article
- Published by Wiley in Genetic Epidemiology
- Vol. 27 (4), 429-441
- https://doi.org/10.1002/gepi.20039
Abstract
Modern molecular techniques make discovery of numerous single nucleotide polymorphisms (SNPs) in candidate gene regions feasible. Conventional analysis relies on either independent tests with each variant or the use of haplotypes in association analysis. The first technique ignores the dependencies between SNPs. The second, though it may increase power, often introduces uncertainty by estimating haplotypes from population data. Additionally, as the number of loci in a haplotype expands, it becomes more ambiguous which underlying genetic components drive a detected association. Here, we present a genotype‐level analysis to jointly model the SNPs via a SNP interaction model with phase information (SIMPle) to capture the underlying haplotype structure. This analysis estimates both the risk associated with each variant and the importance of phase between pairwise combinations of SNPs. Thus, rather than selecting between genotype‐ or haplotype‐level approaches, the SIMPle method frames the analysis of multilocus data in a model selection paradigm, with the aim of determining which SNPs, phase terms, and linear combinations best describe the relation between genetic variation and a trait of interest. To avoid unstable estimation due to sparse data and to incorporate both the dependencies among terms and the uncertainty in model selection, we propose a Bayes model averaging procedure. This highlights key SNPs and phase terms and yields a set of best representative models. Using simulations, we demonstrate the utility of the SIMPle model to identify crucial SNPs and underlying haplotype structures across a variety of causal models and genetic architectures.