Testing Untyped Alleles (TUNA)—applications to genome‐wide association studies
Open Access
- 19 September 2006
- journal article
- research article
- Published by Wiley in Genetic Epidemiology
- Vol. 30 (8) , 718-727
- https://doi.org/10.1002/gepi.20182
Abstract
The large number of tests performed in analyzing data from genome-wide association studies has a large impact on the power of detecting risk variants, and analytic strategies specifying the optimal set of hypotheses to be tested are necessary. We propose a genome-wide strategy that is based on one degree of freedom tests for all the genotyped variants, and for all the untyped variants for which there is sufficient information in the observed data. The set of untyped variants to be tested is found using multi-locus measures of linkage disequilibrium and haplotype frequencies from a reference database such as HapMap (The International HapMap Consortium [2003] Nature 426:789–796). We introduce a novel statistic for testing differences in allele frequencies for untyped variation that is based on linear combinations of estimable haplotype frequencies. Algorithms for finding the sets of genotyped markers to be used in testing an untyped allele, and ways of incorporating haplotypes observed in the study data but not in the reference database, are also described. The proposed testing strategy can be used as the first step in the analysis of genome-wide association data, and, because every performed test is directed to a marker, it can be used to specify the set of polymorphisms to genotype in follow-up studies. The described methodology also provides a tool for joint analysis of data from studies done on different platforms. Genet. Epidemiol. 2006.
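The idea of testing an untyped allele through a linear combination of haplotype frequencies can be illustrated with a minimal Python sketch. This is not the statistic derived in the paper. It assumes, for simplicity, that haplotypes over the genotyped markers are observed with known phase, so that haplotype counts are multinomial, and that the reference-panel weights are fixed constants. The function names, the `weights` mapping (the reference-panel proportion of each haplotype that carries the untyped allele), and the pooled-variance z-test are all hypothetical choices made for illustration.

```python
import math


def untyped_allele_freq(hap_freqs, weights):
    """Estimate an untyped allele frequency as sum_h w_h * f_h.

    hap_freqs: haplotype -> frequency in the study sample.
    weights:   haplotype -> proportion of reference-panel chromosomes with
               that haplotype that carry the untyped allele (assumed known).
    A haplotype missing from `weights` raises KeyError; handling such
    haplotypes is outside the scope of this sketch.
    """
    return sum(weights[h] * hap_freqs[h] for h in sorted(hap_freqs))


def untyped_allele_z(case_counts, ctrl_counts, weights):
    """One degree of freedom z-statistic for a case/control difference.

    Simplifying assumption (not from the paper): haplotype counts are
    multinomial with known phase, and the null variance is computed from
    the pooled haplotype frequencies.
    """
    n_case = sum(case_counts.values())
    n_ctrl = sum(ctrl_counts.values())
    haps = sorted(set(case_counts) | set(ctrl_counts))

    case_f = {h: case_counts.get(h, 0) / n_case for h in haps}
    ctrl_f = {h: ctrl_counts.get(h, 0) / n_ctrl for h in haps}
    pooled = {
        h: (case_counts.get(h, 0) + ctrl_counts.get(h, 0)) / (n_case + n_ctrl)
        for h in haps
    }

    diff = untyped_allele_freq(case_f, weights) - untyped_allele_freq(ctrl_f, weights)

    # Variance of the weight carried by one chromosome under the pooled null:
    # E[w^2] - (E[w])^2.
    mean_w = sum(weights[h] * pooled[h] for h in haps)
    mean_w2 = sum(weights[h] ** 2 * pooled[h] for h in haps)
    var_single = mean_w2 - mean_w ** 2
    if var_single <= 0:
        raise ValueError("no variation in the weighted haplotype score")

    se = math.sqrt(var_single * (1.0 / n_case + 1.0 / n_ctrl))
    return diff / se


def two_sided_p(z):
    """Two-sided normal p-value for a z-statistic."""
    return math.erfc(abs(z) / math.sqrt(2.0))
```

A call such as `untyped_allele_z(case_counts, ctrl_counts, weights)` takes haplotype counts per chromosome in cases and controls. In practice phase is usually unknown, so haplotype frequencies would have to be estimated (for example by an EM algorithm) and the variance adjusted accordingly, which this sketch does not attempt.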