Analysis of case-control studies of genetic and environmental factors with missing genetic information and haplotype-phase ambiguity
- 10 August 2005
- journal article
- research article
- Published by Wiley in Genetic Epidemiology
- Vol. 29 (2) , 108-127
- https://doi.org/10.1002/gepi.20085
Abstract
Case‐control studies of unrelated subjects are now widely used to study the role of genetic susceptibility and gene‐environment interactions in the etiology of complex diseases. Exploiting an assumption of gene‐environment independence, and treating the distribution of environmental exposures as completely nonparametric, Chatterjee and Carroll [2005] (Biometrika 92:399–418) recently developed an efficient retrospective maximum‐likelihood method for analysis of case‐control studies. In this article, we develop an extension of the retrospective maximum‐likelihood approach to studies where genetic information may be missing on some study subjects. In particular, special emphasis is given to haplotype‐based studies where missing data arise due to linkage‐phase ambiguity of genotype data. We use a profile likelihood technique and an appropriate expectation‐maximization (EM) algorithm to derive a relatively simple procedure for parameter estimation, with or without a rare disease assumption, and possibly incorporating information on the marginal probability of the disease for the underlying population. We also describe two alternative robust approaches that are less sensitive to the underlying gene‐environment independence and Hardy‐Weinberg‐equilibrium assumptions. The performance of the proposed methods is studied using simulation studies in the context of haplotype‐based studies of gene‐environment interactions. An application of the proposed method is illustrated using a case‐control study of ovarian cancer designed to investigate the interaction between BRCA1/2 mutations and reproductive risk factors in the etiology of ovarian cancer. Genet. Epidemiol., 2005. Published 2005 Wiley‐Liss, Inc.
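To make the notion of linkage‐phase ambiguity concrete, the following is a minimal sketch of the classic EM step for estimating haplotype frequencies from unphased genotypes under Hardy‐Weinberg equilibrium. It is not the retrospective profile‐likelihood procedure of the article: it ignores disease status and environmental exposures, and it assumes only two biallelic SNPs. All names (`HAPLOTYPES`, `compatible_pairs`, `em_haplotype_frequencies`) are hypothetical and chosen for illustration.

```python
# Haplotypes over two biallelic SNPs, coded as (allele at SNP 1, allele at SNP 2).
HAPLOTYPES = [(0, 0), (0, 1), (1, 0), (1, 1)]


def compatible_pairs(genotype):
    """Return unordered haplotype index pairs (i <= j) consistent with an
    unphased genotype, given as the count of allele 1 at each SNP."""
    pairs = []
    for i, h1 in enumerate(HAPLOTYPES):
        for j in range(i, len(HAPLOTYPES)):
            h2 = HAPLOTYPES[j]
            if all(a + b == g for a, b, g in zip(h1, h2, genotype)):
                pairs.append((i, j))
    return pairs


def em_haplotype_frequencies(genotypes, n_iter=200, tol=1e-10):
    """Estimate haplotype frequencies by EM, assuming Hardy-Weinberg equilibrium."""
    freqs = [0.25] * len(HAPLOTYPES)  # uniform starting values (an assumption)
    for _ in range(n_iter):
        counts = [0.0] * len(HAPLOTYPES)
        for g in genotypes:
            pairs = compatible_pairs(g)
            # E-step: under HWE, P(pair) is f_i^2 if i == j, else 2 * f_i * f_j.
            weights = [(1 if i == j else 2) * freqs[i] * freqs[j] for i, j in pairs]
            total = sum(weights)
            for (i, j), w in zip(pairs, weights):
                counts[i] += w / total
                counts[j] += w / total
        # M-step: expected haplotype counts divided by the number of chromosomes.
        new_freqs = [c / (2 * len(genotypes)) for c in counts]
        converged = max(abs(a - b) for a, b in zip(new_freqs, freqs)) < tol
        freqs = new_freqs
        if converged:
            break
    return freqs


# The double heterozygote (1, 1) is the only phase-ambiguous genotype here.
sample = [(0, 0), (2, 2), (1, 1), (1, 1)]
estimates = em_haplotype_frequencies(sample)
```

In this toy setting only the double heterozygote is ambiguous, since it can arise from either the (00, 11) or the (01, 10) haplotype pair; the E‐step splits each such subject between the two configurations in proportion to their current HWE probabilities.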
This publication has 14 references indexed in Scilit:
- Semiparametric maximum likelihood estimation exploiting gene-environment independence in case-control studies. Biometrika, 2005
- Evaluating associations of haplotypes with traits. Genetic Epidemiology, 2004
- Comparison of prospective and retrospective methods for haplotype inference in case-control studies. Genetic Epidemiology, 2004
- Inference on Haplotype Effects in Case-Control Studies Using Unphased Genotype Data. American Journal of Human Genetics, 2003
- Modeling and E-M Estimation of Haplotype-Specific Relative Risks from Genotype Data for a Case-Control Study of Unrelated Individuals. Human Heredity, 2003
- Estimation and Tests of Haplotype-Environment Interaction when Linkage Phase Is Ambiguous. Human Heredity, 2003
- A Method for the Assessment of Disease Associations with Single-Nucleotide Polymorphism Haplotypes and Environmental Variables in Case-Control Studies. American Journal of Human Genetics, 2003
- Parity, Oral Contraceptives, and the Risk of Ovarian Cancer among Carriers and Noncarriers of a BRCA1 or BRCA2 Mutation. New England Journal of Medicine, 2001
- A Semiparametric Mixture Approach to Case-Control Studies with Errors in Covariables. Journal of the American Statistical Association, 1996
- Logistic disease incidence models and case-control studies. Biometrika, 1979