Ensemble dependence model for classification and prediction of cancer and normal gene expression data
Open Access
- 6 May 2005
- journal article
- research article
- Published by Oxford University Press (OUP) in Bioinformatics
- Vol. 21 (14) , 3114-3121
- https://doi.org/10.1093/bioinformatics/bti483
Abstract
Motivation: DNA microarray technologies make it possible to simultaneously monitor thousands of genes' expression levels. A topic of great interest is to study the different expression profiles between microarray samples from cancer patients and normal subjects, by classifying them at gene expression levels. Currently, various clustering methods have been proposed in the literature to classify cancer and normal samples based on microarray data, and they are predominantly data-driven approaches. In this paper, we propose an alternative approach, a model-driven approach, which can reveal the relationship between the global gene expression profile and the subject's health status, and thus is promising in predicting the early development of cancer.

Results: In this work, we propose an ensemble dependence model, aimed at exploring the group dependence relationship of gene clusters. Under the framework of hypothesis-testing, we employ genes' dependence relationship as a feature to model and classify cancer and normal samples. The proposed classification scheme is applied to several real cancer datasets, including cDNA, Affymetrix microarray and proteomic data. It is noted that the proposed method yields very promising performance. We further investigate the eigenvalue pattern of the proposed method, and we discover different patterns between cancer and normal samples. Moreover, the transition between cancer and normal patterns suggests that the eigenvalue pattern of the proposed models may have potential to predict the early stage of cancer development. In addition, we examine the effects of possible model mismatch on the proposed scheme.

Availability: See the supplemental website at http://dsplab.eng.umd.edu/~genomics/edm

Contact: qiupeng@umd.edu
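The abstract does not state the model's equations, so the following is only a minimal sketch of the general idea it describes, not the authors' implementation. It assumes that genes have already been assigned to clusters, that each cluster is summarized by its mean expression per sample, and that each class (cancer, normal) is modeled as a multivariate Gaussian over those cluster profiles. Classification is then a likelihood comparison between the two class models, and the "eigenvalue pattern" is taken here to mean the sorted eigenvalues of the cluster-level correlation matrix. All function names, the ridge term, and the Gaussian assumption are hypothetical choices made for illustration.

```python
import numpy as np

def cluster_profiles(X, labels, n_clusters):
    """Average the genes in each cluster. X is (samples x genes);
    labels[g] is the (assumed, precomputed) cluster index of gene g."""
    labels = np.asarray(labels)
    return np.column_stack(
        [X[:, labels == k].mean(axis=1) for k in range(n_clusters)]
    )

def fit_gaussian(Z, ridge=1e-3):
    """Estimate mean and covariance of cluster profiles for one class.
    The small ridge term (an assumption) keeps the covariance invertible."""
    mu = Z.mean(axis=0)
    cov = np.cov(Z, rowvar=False) + ridge * np.eye(Z.shape[1])
    return mu, cov

def log_likelihood(z, mu, cov):
    """Log-density of one cluster profile z under a Gaussian class model."""
    d = z - mu
    _, logdet = np.linalg.slogdet(cov)
    maha = d @ np.linalg.solve(cov, d)
    return -0.5 * (maha + logdet + len(z) * np.log(2.0 * np.pi))

def classify(z, model_cancer, model_normal):
    """Two-hypothesis test: pick the class model with the higher likelihood."""
    ll_c = log_likelihood(z, *model_cancer)
    ll_n = log_likelihood(z, *model_normal)
    return "cancer" if ll_c > ll_n else "normal"

def eigenvalue_pattern(Z):
    """Eigenvalues of the cluster-level correlation matrix, largest first."""
    corr = np.corrcoef(Z, rowvar=False)
    return np.sort(np.linalg.eigvalsh(corr))[::-1]

def demo(seed=0):
    """Run the sketch on synthetic data (not real microarray data)."""
    rng = np.random.default_rng(seed)
    n_genes, n_clusters = 40, 4
    labels = np.arange(n_genes) % n_clusters
    # Synthetic "normal" samples: independent genes.
    X_normal = rng.normal(size=(30, n_genes))
    # Synthetic "cancer" samples: a shared factor couples all genes.
    shared = rng.normal(size=(30, 1))
    X_cancer = 0.5 * rng.normal(size=(30, n_genes)) + shared
    Z_n = cluster_profiles(X_normal, labels, n_clusters)
    Z_c = cluster_profiles(X_cancer, labels, n_clusters)
    print("normal eigenvalues:", eigenvalue_pattern(Z_n))
    print("cancer eigenvalues:", eigenvalue_pattern(Z_c))
    print("label:", classify(Z_c[0], fit_gaussian(Z_c), fit_gaussian(Z_n)))

if __name__ == "__main__":
    demo()
```

In this sketch the dependence between clusters enters only through the class covariance matrices, so two classes with similar mean expression can still be separated if their clusters co-vary differently. How the paper itself clusters genes and parameterizes the dependence is not specified in the abstract.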