Sample size calculations for population‐ and family‐based case‐control association studies on marker genotypes

Abstract
Most previous sample size calculations for case‐control studies to detect genetic associations with disease assumed that the disease gene locus is known, whereas, in fact, markers are used. We calculated sample sizes for unmatched case‐control and sibling case‐control studies to detect an association between a biallelic marker and a disease governed by a putative biallelic disease locus. Required sample sizes increase with increasing discrepancy between the marker and disease allele frequencies, and with less‐than‐maximal linkage disequilibrium between the marker and disease alleles. Qualitatively similar results were found for studies of parent‐offspring triads based on the transmission disequilibrium test (Abel and Müller‐Myhsok, 1998, Am. J. Hum. Genet. 63:664–667; Tu and Whittemore, 1999, Am. J. Hum. Genet. 64:641–649). We also studied other factors affecting required sample size, including attributable risk for the disease allele, inheritance mechanism, disease prevalence, and, for sibling case‐control designs, extragenetic familial aggregation of disease and recombination. The large sample‐size requirements represent a formidable challenge to studies of this type. Genet Epidemiol 25:136–148, 2003. Published 2003 Wiley‐Liss, Inc.
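
The dependence of sample size on allele‐frequency discrepancy and linkage disequilibrium can be illustrated with a minimal sketch. It is not the calculation used in this paper. It assumes Hardy‐Weinberg equilibrium, an unmatched design with equal numbers of cases and controls, a simple two‐sided allele‐count test with the standard two‐proportion sample‐size formula, and positive linkage disequilibrium expressed as a fraction `d_prime` of its maximum. The function names and the parameters `p` (disease allele frequency), `m` (marker allele frequency), and `f0`, `f1`, `f2` (penetrances for 0, 1, and 2 disease alleles) are hypothetical choices made for illustration.

```python
from math import ceil, sqrt
from statistics import NormalDist


def marker_freqs(p, m, d_prime, f0, f1, f2):
    """Marker allele frequency among case and control chromosomes.

    Illustrative assumptions: Hardy-Weinberg equilibrium and positive LD.
    """
    # LD coefficient as a fraction d_prime of its maximum positive value.
    delta = d_prime * min(p * (1 - m), m * (1 - p))
    # Frequencies of haplotypes carrying marker allele M with D or with d.
    h_MD = p * m + delta
    h_Md = m * (1 - p) - delta
    # Disease risk for a person, given one chromosome carries D (or d).
    risk_D = p * f2 + (1 - p) * f1
    risk_d = p * f1 + (1 - p) * f0
    prevalence = p * risk_D + (1 - p) * risk_d
    q_case = (h_MD * risk_D + h_Md * risk_d) / prevalence
    q_ctrl = (h_MD * (1 - risk_D) + h_Md * (1 - risk_d)) / (1 - prevalence)
    return q_case, q_ctrl


def cases_needed(p, m, d_prime, f0, f1, f2, alpha=0.05, power=0.8):
    """Approximate number of cases (equal number of controls) required."""
    q1, q0 = marker_freqs(p, m, d_prime, f0, f1, f2)
    if q1 == q0:
        raise ValueError("no marker association: sample size is unbounded")
    z_a = NormalDist().inv_cdf(1 - alpha / 2)
    z_b = NormalDist().inv_cdf(power)
    q_bar = (q1 + q0) / 2
    # Standard two-proportion formula, applied to alleles per group.
    n_alleles = (
        z_a * sqrt(2 * q_bar * (1 - q_bar))
        + z_b * sqrt(q1 * (1 - q1) + q0 * (1 - q0))
    ) ** 2 / (q1 - q0) ** 2
    # Each person contributes two alleles.
    return ceil(n_alleles / 2)


# Hypothetical penetrances, chosen only to exercise the functions.
n_matched = cases_needed(0.1, 0.1, 1.0, 0.01, 0.02, 0.04)
n_weak_ld = cases_needed(0.1, 0.1, 0.5, 0.01, 0.02, 0.04)
n_mismatch = cases_needed(0.1, 0.3, 1.0, 0.01, 0.02, 0.04)
print(n_matched, n_weak_ld, n_mismatch)
```

Under these assumptions, the marker frequency difference between cases and controls is proportional to the LD coefficient, so halving `d_prime` or moving `m` away from `p` raises the required sample size, consistent with the qualitative pattern described above.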