PheWAS: demonstrating the feasibility of a phenome-wide scan to discover gene–disease associations
Open Access
- 24 March 2010
- journal article
- research article
- Published by Oxford University Press (OUP) in Bioinformatics
- Vol. 26 (9) , 1205-1210
- https://doi.org/10.1093/bioinformatics/btq126
Abstract
Motivation: Emergence of genetic data coupled to longitudinal electronic medical records (EMRs) offers the possibility of phenome-wide association scans (PheWAS) for disease–gene associations. We propose a novel method to scan phenomic data for genetic associations using International Classification of Disease (ICD9) billing codes, which are available in most EMR systems. We have developed a code translation table to automatically define 776 different disease populations and their controls using prevalent ICD9 codes derived from EMR data. As a proof of concept of this algorithm, we genotyped the first 6005 European–Americans accrued into BioVU, Vanderbilt's DNA biobank, at five single nucleotide polymorphisms (SNPs) with previously reported disease associations: atrial fibrillation, Crohn's disease, carotid artery stenosis, coronary artery disease, multiple sclerosis, systemic lupus erythematosus and rheumatoid arthritis. The PheWAS software generated cases and control populations across all ICD9 code groups for each of these five SNPs, and disease–SNP associations were analyzed. The primary outcome of this study was replication of seven previously known SNP–disease associations for these SNPs.

Results: Four of seven known SNP–disease associations using the PheWAS algorithm were replicated with P-values between 2.8 × 10^−6 and 0.011. The PheWAS algorithm also identified 19 previously unknown statistical associations between these SNPs and diseases at P < 0.01. This study indicates that PheWAS analysis is a feasible method to investigate SNP–disease associations. Further evaluation is needed to determine the validity of these associations and the appropriate statistical thresholds for clinical significance.

Availability: The PheWAS software and code translation table are freely available at http://knowledgemap.mc.vanderbilt.edu/research.

Contact: josh.denny@vanderbilt.edu
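The scan described in the abstract (translate ICD9 codes into disease groups, split patients into cases and controls per group, then test each group against a SNP) can be sketched roughly as follows. This is an illustrative sketch, not the PheWAS software itself. The three-entry `CODE_TO_GROUP` table, the prefix-matching rule, the function names, and the choice of a 1-df allelic chi-square test are all assumptions made for the example; the actual translation table defines 776 groups, and its control definitions may differ.

```python
import math

# Hypothetical excerpt of a code translation table (ICD9 code prefix -> disease group).
# These entries are illustrative only, not copied from the published table.
CODE_TO_GROUP = {
    "427.31": "atrial fibrillation",
    "555": "Crohn's disease",
    "340": "multiple sclerosis",
}


def disease_groups(icd9_codes):
    """Return the disease groups matched by one patient's ICD9 billing codes."""
    return {
        group
        for code in icd9_codes
        for prefix, group in CODE_TO_GROUP.items()
        if code.startswith(prefix)  # assumed matching rule: simple prefix match
    }


def allelic_chi2_p(case_genotypes, control_genotypes):
    """P-value of a 1-df chi-square test on a 2x2 allele-count table.

    Genotypes are minor-allele counts (0, 1 or 2) per patient.
    """
    a = sum(case_genotypes)              # minor alleles in cases
    b = 2 * len(case_genotypes) - a      # major alleles in cases
    c = sum(control_genotypes)           # minor alleles in controls
    d = 2 * len(control_genotypes) - c   # major alleles in controls
    n = a + b + c + d
    margins = (a + b) * (c + d) * (a + c) * (b + d)
    if margins == 0:
        return 1.0  # degenerate table: no evidence either way
    chi2 = n * (a * d - b * c) ** 2 / margins
    # For 1 degree of freedom, P(X > chi2) = erfc(sqrt(chi2 / 2)).
    return math.erfc(math.sqrt(chi2 / 2))


def phewas_scan(patients):
    """Map each disease group with at least one case to its association p-value.

    `patients` is a list of (icd9_codes, genotype) pairs for a single SNP.
    """
    labelled = [(disease_groups(codes), g) for codes, g in patients]
    results = {}
    for group in set().union(*(groups for groups, _ in labelled)):
        cases = [g for groups, g in labelled if group in groups]
        # Simplification: every patient without a matching code is a control.
        controls = [g for groups, g in labelled if group not in groups]
        results[group] = allelic_chi2_p(cases, controls)
    return results
```

Running `phewas_scan` once per SNP yields one p-value per disease group, which is the shape of result the abstract reports. Because hundreds of groups are tested per SNP, any threshold applied to these p-values would need a multiple-testing adjustment, a point the abstract itself leaves open.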