Genome-wide association studies for common diseases and complex traits

Abstract
Genome-wide association studies are rapidly becoming feasible as an approach for identifying the genes that underlie common diseases and related quantitative traits. This strategy combines a comprehensive and unbiased survey of the genome with the power to detect common alleles with modest phenotypic effects. Sets of markers for genome-wide association studies can be chosen using various criteria, but the degree to which a particular marker set actually surveys the genome should be evaluated if the label “genome-wide association” is to be applied. Empirical assessments of linkage disequilibrium patterns, such as those that are being performed in the HapMap project, will enable the selection of efficient sets of markers and the evaluation of the comprehensiveness of a given marker set. Study design and interpretation of results must include appropriate statistical thresholds that take multiple-hypothesis testing into account, as can be achieved, for example, by permutation testing. Balancing the need for power to detect modest effects with the cost of genotyping large numbers of markers will probably require a multi-stage design. False-positive results that arise due to population stratification might outnumber true associations, and population stratification should be assessed and corrected for, if needed. Alternatively, family-based designs can be used, but high-quality data are needed to avoid artifacts that are specific to these designs. Gene–gene and gene–environment interactions might be common in complex traits, but unbounded searches for such interactions are unlikely to retain adequate power in studies of hundreds of thousands of markers. Either new methods will be required, or, alternatively, markers with individual effects will need to be identified first, followed by focused searches for interactions.
Genome-wide association studies are likely to become a reality in the near future. Care will be required in their design, performance, analysis and interpretation, and well-conceived pilot studies might be valuable for understanding and minimizing the pitfalls of this approach. Nevertheless, genome-wide association studies have the potential to identify many genes for common diseases and quantitative traits.
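
To make the permutation-testing idea mentioned above concrete, the following is a minimal sketch in Python of one common variant, a max-statistic permutation test. It is an illustration under stated assumptions, not a method prescribed by this article: the function names (`allelic_stat`, `max_stat_permutation`), the 0/1/2 allele-count coding, the 0/1 case-control labels, and the simple difference-in-means statistic are all hypothetical choices made for brevity. The idea is that shuffling phenotype labels preserves the correlation between markers, so comparing each observed statistic with the largest statistic seen anywhere in each shuffled data set gives a p-value that already accounts for the number of markers tested.

```python
import random


def allelic_stat(genotypes, labels):
    """Absolute difference in mean allele count between cases (1) and controls (0).

    Assumes both groups are non-empty; a real analysis would more likely use
    a chi-square or trend test (this statistic is an illustrative stand-in).
    """
    n_cases = sum(labels)
    n_controls = len(labels) - n_cases
    case_sum = sum(g for g, y in zip(genotypes, labels) if y == 1)
    control_sum = sum(genotypes) - case_sum
    return abs(case_sum / n_cases - control_sum / n_controls)


def max_stat_permutation(markers, labels, n_perm=1000, seed=0):
    """Return one multiplicity-adjusted p-value per marker.

    markers: list of per-marker genotype lists (allele counts 0, 1 or 2).
    labels:  list of 0/1 phenotype labels, one per individual.
    """
    rng = random.Random(seed)
    observed = [allelic_stat(m, labels) for m in markers]
    shuffled = list(labels)
    exceed = [0] * len(markers)
    for _ in range(n_perm):
        # Shuffling labels breaks any genotype-phenotype association while
        # keeping the correlation structure between markers intact.
        rng.shuffle(shuffled)
        null_max = max(allelic_stat(m, shuffled) for m in markers)
        for i, obs in enumerate(observed):
            if null_max >= obs:
                exceed[i] += 1
    # Add-one correction so that no adjusted p-value is exactly zero.
    return [(count + 1) / (n_perm + 1) for count in exceed]


if __name__ == "__main__":
    demo_rng = random.Random(42)
    labels = [1] * 50 + [0] * 50
    # Ten simulated markers with no built-in association (hypothetical data).
    markers = [[demo_rng.choice([0, 1, 2]) for _ in labels] for _ in range(10)]
    print(max_stat_permutation(markers, labels, n_perm=200))
```

With hundreds of thousands of markers and thousands of individuals, a loop of this kind would be far too slow in pure Python; the sketch is meant only to show the logic of deriving a study-wide threshold from permuted data.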