Integrated modeling of clinical and gene expression information for personalized prediction of disease outcomes
- 19 May 2004
- journal article
- Published in Proceedings of the National Academy of Sciences
- Vol. 101 (22) , 8431-8436
- https://doi.org/10.1073/pnas.0401736101
Abstract
We describe a comprehensive modeling approach to combining genomic and clinical data for personalized prediction in disease outcome studies. This integrated clinicogenomic modeling framework is based on statistical classification tree models that evaluate the contributions of multiple forms of data, both clinical and genomic, to define interactions of multiple risk factors that associate with the clinical outcome and derive predictions customized to the individual patient level. Gene expression data from DNA microarrays is represented by multiple summary measures that we term metagenes; each metagene characterizes the dominant common expression pattern within a cluster of genes. A case study of primary breast cancer recurrence demonstrates that models using multiple metagenes combined with traditional clinical risk factors improve prediction accuracy at the individual patient level, delivering predictions more accurate than those made by using a single genomic predictor or clinical data alone. The analysis also highlights issues of communicating uncertainty in prediction and identifies combinations of clinical and genomic risk factors playing predictive roles. Implicated metagenes identify gene subsets with the potential to aid biological interpretation. This framework will extend to incorporate any form of data, including emerging forms of genomic data, and provides a platform for development of models for personalized prognosis.
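The metagene idea in the abstract (one summary profile capturing the dominant common expression pattern of a gene cluster) can be illustrated with a minimal sketch. The abstract does not say how clusters are formed or how the dominant pattern is extracted, so everything here is an assumption: the clusters are taken as given, the dominant pattern is taken to be the first principal component across samples (computed by power iteration), and the gene names, expression values, and function names are hypothetical.

```python
import math


def center(row):
    """Subtract a gene's mean expression across samples."""
    mean = sum(row) / len(row)
    return [x - mean for x in row]


def dominant_pattern(rows, iters=100):
    """Unit-length sample profile that best summarizes the given gene rows.

    Assumption: "dominant common expression pattern" is read here as the
    first principal component of the centered cluster, found by power
    iteration on X^T X. This is an illustrative choice, not the paper's
    stated procedure.
    """
    X = [center(r) for r in rows]
    n = len(X[0])
    # Start from the highest-variance gene; a constant start vector would
    # be orthogonal to every centered row and the iteration would stall.
    start = max(X, key=lambda r: sum(x * x for x in r))
    norm = math.sqrt(sum(x * x for x in start))
    if norm == 0.0:
        return [0.0] * n  # all genes are flat: no pattern to summarize
    v = [x / norm for x in start]
    for _ in range(iters):
        u = [sum(a * b for a, b in zip(r, v)) for r in X]  # X v
        w = [sum(u[i] * X[i][j] for i in range(len(X))) for j in range(n)]  # X^T u
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break
        v = [x / norm for x in w]
    # Fix the sign so the metagene follows the cluster's average profile.
    mean_profile = [sum(r[j] for r in X) / len(X) for j in range(n)]
    if sum(a * b for a, b in zip(v, mean_profile)) < 0:
        v = [-x for x in v]
    return v


def compute_metagenes(expression, clusters):
    """Map each cluster id to its metagene (one value per sample)."""
    return {
        cid: dominant_pattern([expression[g] for g in genes])
        for cid, genes in clusters.items()
    }


# Hypothetical toy data: four genes measured on four samples.
expression = {
    "g1": [1.0, 2.0, 3.0, 4.0],
    "g2": [2.0, 4.0, 6.0, 8.0],
    "g3": [5.0, 3.0, 4.0, 1.0],
    "g4": [9.0, 6.0, 7.0, 2.0],
}
clusters = {"A": ["g1", "g2"], "B": ["g3", "g4"]}  # assumed, not derived

metagenes = compute_metagenes(expression, clusters)
for cid, profile in metagenes.items():
    print(cid, [round(x, 3) for x in profile])
```

In a setting like the one the abstract describes, each metagene value per patient would then sit alongside clinical risk factors as a candidate predictor in a classification tree; that modeling step is not sketched here.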