Analysis of single‐locus tests to detect gene/disease associations

9 March 2005

journal article
research article
Published by Wiley in Genetic Epidemiology

Vol. 28 (3), 207-219
https://doi.org/10.1002/gepi.20050

Abstract

A goal of association analysis is to determine whether variation in a particular candidate region or gene is associated with liability to complex disease. To evaluate such candidates, ubiquitous single nucleotide polymorphisms (SNPs) are useful. It is critical, however, to select a set of SNPs that are in substantial linkage disequilibrium (LD) with all other polymorphisms in the region. Whether there is an ideal statistical framework to test such a set of ‘tag SNPs’ for association is unknown. Recent evidence suggests that tests for association based on linear combinations of the tag SNPs (Hotelling T² test) are more powerful than tests based on frequencies of haplotypes. Following this logical progression, we asked whether single‐locus tests would prove generally more powerful than the regression‐based tests. We answer this question by investigating four inferential procedures: the maximum of a series of test statistics corrected for multiple testing by the Bonferroni procedure, T_B, or by permutation of case‐control status, T_P; a procedure that tests the maximum of a smoothed curve fitted to the series of test statistics, T_S; and the Hotelling T² procedure, which we call T_R. These procedures are evaluated by simulating data like those from human populations, including realistic levels of LD and realistic effects of alleles conferring liability to disease. We find that power depends on the correlation structure of SNPs within a gene, the density of tag SNPs, and the placement of the liability allele. The clearest pattern emerges between power and the number of SNPs selected. When a large fraction of the SNPs within a gene are tested, and multiple SNPs are highly correlated with the liability allele, T_S has better power. Under a SNP selection scheme that optimizes power but also requires a substantial number of SNPs to be genotyped (roughly 10–20 SNPs per gene), the power of T_P is generally superior to that of the other procedures, including T_R. Finally, when a SNP selection procedure that targets a minimal number of SNPs per gene is applied, the average performances of T_P and T_R are indistinguishable. Genet. Epidemiol.
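
The two maximum-statistic procedures, T_B and T_P, can be illustrated with a minimal sketch. The abstract does not say which single-locus statistic the authors used, so the code below assumes an Armitage-type trend statistic (N times the squared correlation between genotype and case status, referred to a 1-df chi-square). The function names, the toy data with independent SNPs, and the effect size are all hypothetical and do not reproduce the paper's simulations.

```python
import math
import numpy as np


def trend_statistics(genotypes, status):
    """Per-SNP trend statistic N * r^2 (assumed choice, not from the paper).

    genotypes: (n_subjects, n_snps) array of 0/1/2 allele counts.
    status:    (n_subjects,) array, 1 = case, 0 = control.
    Assumes every SNP is polymorphic in the sample.
    """
    g = genotypes - genotypes.mean(axis=0)
    y = status - status.mean()
    num = (g * y[:, None]).sum(axis=0) ** 2
    den = (g ** 2).sum(axis=0) * (y ** 2).sum()
    return len(status) * num / den


def bonferroni_pvalue(stats):
    """T_B-style p-value: smallest single-SNP p-value times the number of SNPs."""
    # Upper tail of a 1-df chi-square at x equals erfc(sqrt(x / 2)).
    p_min = math.erfc(math.sqrt(stats.max() / 2.0))
    return min(1.0, len(stats) * p_min)


def permutation_pvalue(genotypes, status, n_perm=1000, seed=0):
    """T_P-style p-value: permute case-control labels, recompute the maximum."""
    rng = np.random.default_rng(seed)
    observed = trend_statistics(genotypes, status).max()
    exceed = 0
    for _ in range(n_perm):
        permuted = rng.permutation(status)
        if trend_statistics(genotypes, permuted).max() >= observed:
            exceed += 1
    # Add-one correction keeps the estimate strictly positive.
    return (exceed + 1) / (n_perm + 1)


# Toy data (hypothetical): 400 subjects, 10 independent SNPs, SNP 4 affects risk.
rng = np.random.default_rng(1)
n, m = 400, 10
genotypes = rng.binomial(2, 0.3, size=(n, m))
risk = 1.0 / (1.0 + np.exp(-(-0.5 + 0.6 * genotypes[:, 4])))
status = (rng.random(n) < risk).astype(float)

stats = trend_statistics(genotypes, status)
print("Bonferroni p-value: ", bonferroni_pvalue(stats))
print("Permutation p-value:", permutation_pvalue(genotypes, status))
```

Permuting the labels leaves the correlation among SNPs intact, so the permutation reference distribution of the maximum reflects the LD in the sample, whereas the Bonferroni correction treats the tests as if they were unrelated. The toy SNPs here are independent, so the two corrections should behave similarly on this example.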
