Tree-structured supervised learning and the genetics of hypertension

Open Access

12 July 2004

journal article
research article
Published in Proceedings of the National Academy of Sciences

Vol. 101 (29) , 10529-10534
https://doi.org/10.1073/pnas.0403794101

Abstract

This paper presents FlexTree, an algorithm for general supervised learning. It extends the binary tree-structured approach (Classification and Regression Trees, CART) but differs greatly in how it selects and combines predictors. It is particularly applicable to assessing interactions, gene by gene and gene by environment, as they bear on complex disease. One model for predisposition to complex disease involves many genes. Most of them are pure noise; only a minority contribute to the signal, and for each of those genes every genotype other than the prevalent one carries a “score.” Scores add, and individuals whose total score exceeds an unknown threshold are predisposed to the disease. For the additive score problem on simulated data, FlexTree has better cross-validated risk than many cutting-edge technologies to which it was compared when small fractions of candidate genes carry the signal. For the model in which only a precise list of aberrant genotypes is predisposing, there is no systematic pattern of absolute superiority; overall, however, FlexTree seems better than the other technologies. We tried the algorithm on data from 563 Chinese women, 206 hypotensive and 357 hypertensive, with information on ethnicity, menopausal status, insulin-resistant status, and 21 loci. FlexTree and Logic Regression appear better than the others in terms of Bayes risk. However, the differences are not significant in the usual statistical sense.
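The additive score model described in the abstract can be illustrated with a small simulation. The sketch below is not the authors' code or simulation design. It assumes binary genotypes (0 for the prevalent genotype, 1 for any other), one equal score per signal gene, and a threshold chosen for illustration. The function name `simulate_additive_scores` and all parameter names and defaults are hypothetical.

```python
import random


def simulate_additive_scores(n_individuals=200, n_genes=30, n_signal=3,
                             minor_freq=0.3, score=1.0, threshold=1.5, seed=0):
    """Simulate genotypes and disease predisposition under an additive score model.

    Assumptions (not from the paper): genotypes are binary, every signal gene
    carries the same score, and the threshold is known to the simulator.
    """
    rng = random.Random(seed)
    # A small minority of genes carries the signal; the rest are pure noise.
    signal_genes = sorted(rng.sample(range(n_genes), n_signal))
    genotypes, labels = [], []
    for _ in range(n_individuals):
        # 0 = prevalent genotype, 1 = non-prevalent genotype.
        row = [1 if rng.random() < minor_freq else 0 for _ in range(n_genes)]
        # Scores add across the signal genes only.
        total = sum(score * row[g] for g in signal_genes)
        genotypes.append(row)
        # Predisposed (label 1) when the total score exceeds the threshold.
        labels.append(1 if total > threshold else 0)
    return genotypes, labels, signal_genes


if __name__ == "__main__":
    X, y, signal = simulate_additive_scores()
    print("signal genes:", signal)
    print("fraction predisposed:", sum(y) / len(y))
```

A learner applied to such data sees all `n_genes` columns without knowing which ones carry the signal, which is the setting where the abstract reports FlexTree's advantage when the signal fraction is small.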
