Data mining
- 8 December 2005
- journal article
- gaw14 workshop-summary
- Published by Wiley in Genetic Epidemiology
- Vol. 29 (S1) , S103-S109
- https://doi.org/10.1002/gepi.20117
Abstract
Group 14 used data-mining strategies to evaluate a number of issues, including appropriate diagnosis, haplotype estimation, genetic linkage and association studies, and type I error. Methods ranged from exploratory analyses, to machine learning strategies (neural networks, supervised learning, and tree-based methods), to false discovery rate control of type I errors. The general motivations were to find the “story” in the data and to summarize information from a multitude of measures. Several methods illustrated strategies for better trait definition, using summarization of related traits. In the few studies that sought to identify genes for alcoholism, there was little agreement among the different strategies, likely reflecting the complexities of the disease. Nevertheless, Group 14 found that these methods offered strategies to gain a better understanding of the complex pathways by which disease develops. Genet. Epidemiol. 29(Suppl. 1):S103–S109, 2005.