An R Package for Analysis of Whole-Genome Association Studies
- 1 April 2007
- journal article
- Published by S. Karger AG in Human Heredity
- Vol. 64 (1), 45-51
- https://doi.org/10.1159/000101422
Abstract
The aim of this work is to provide data classes and methods that facilitate the analysis of whole-genome association studies in the R language for statistical computing. We have implemented data classes in which each genotype call is stored as a single byte. At this density, data for single chromosomes derived from large studies and new high-throughput gene chip platforms can be handled in memory. We use the object-oriented programming model introduced with version 4 of the S-plus package, usually termed 'S4 methods'. At the current state of development the package supports only population-based studies, although we hope to provide support for family-based studies soon. Both quantitative and qualitative phenotypes may be analysed. Flexible association testing functions are provided that can carry out single-SNP tests controlling for potential confounding by quantitative and qualitative covariates. Tests involving several SNPs taken together as 'tags' are also supported. Efficient calculation of pairwise linkage disequilibrium measures is implemented, and the data input functions include one that can download data directly from the International HapMap Project website.
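The one-byte-per-call storage described in the abstract can be illustrated with a minimal sketch. It is written in Python rather than R and is not the package's implementation. The genotype coding (0 for a missing call, 1, 2, 3 for the three genotypes), the function names `pack_genotypes` and `r_squared`, and the study dimensions are all assumptions made for illustration. The linkage disequilibrium measure shown is the squared correlation of allele counts between two SNPs, a simple stand-in that needs no phased haplotypes; the package's own measures may be computed differently.

```python
# Hypothetical coding, assumed for illustration only:
# 0 = missing call, 1 = AA, 2 = AB, 3 = BB. One byte per genotype call.
MISSING = 0


def pack_genotypes(calls):
    """Store a sequence of genotype codes (0..3) as one byte per call."""
    if any(c not in (0, 1, 2, 3) for c in calls):
        raise ValueError("genotype codes must be 0, 1, 2 or 3")
    return bytearray(calls)


def r_squared(snp_a, snp_b):
    """Squared correlation of allele counts between two SNPs.

    Uses only samples with a non-missing call at both SNPs. Returns None
    when fewer than two such samples exist or either SNP is monomorphic.
    """
    # Convert codes 1..3 to allele counts 0..2, skipping missing calls.
    pairs = [(a - 1, b - 1) for a, b in zip(snp_a, snp_b)
             if a != MISSING and b != MISSING]
    n = len(pairs)
    if n < 2:
        return None
    mean_a = sum(a for a, _ in pairs) / n
    mean_b = sum(b for _, b in pairs) / n
    cov = sum((a - mean_a) * (b - mean_b) for a, b in pairs)
    var_a = sum((a - mean_a) ** 2 for a, _ in pairs)
    var_b = sum((b - mean_b) ** 2 for _, b in pairs)
    if var_a == 0 or var_b == 0:
        return None
    return (cov * cov) / (var_a * var_b)


# Memory arithmetic for a hypothetical chromosome: samples x SNPs bytes.
n_samples, n_snps = 2000, 50000
bytes_needed = n_samples * n_snps  # one byte per genotype call

snp1 = pack_genotypes([1, 2, 3, 1, 2, 3])
snp2 = pack_genotypes([1, 2, 3, 1, 2, 3])
ld = r_squared(snp1, snp2)  # identical SNPs are in complete LD
```

With these assumed dimensions, one chromosome occupies 100,000,000 bytes (about 100 MB), which is the sense in which data at this density can be held in memory.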