Independent component analysis-based penalized discriminant method for tumor classification using gene expression data
Open Access
- 18 May 2006
- journal article
- research article
- Published by Oxford University Press (OUP) in Bioinformatics
- Vol. 22 (15) , 1855-1862
- https://doi.org/10.1093/bioinformatics/btl190
Abstract
Motivation: Microarrays can measure the expression levels of thousands of genes simultaneously. One important application of gene expression data is the classification of samples into categories. Combined with classification methods, this technology can support clinical management decisions for individual patients, e.g. in oncology. Standard statistical methodologies for classification or prediction do not work well when the number of variables p (genes) far exceeds the number of samples n. Analyzing microarray data therefore requires modifying existing statistical methodologies or developing new ones.
Results: This paper proposes a new method for tumor classification using gene expression data. The method first employs independent component analysis to model the gene expression data, then applies an optimal scoring algorithm to classify them. Specifically, the approach first makes full use of the high-order statistical information contained in the gene expression data. Second, it employs regularized regression models to handle large numbers of correlated predictor variables. Finally, predictive models are developed for classifying tumors based on the entire gene expression profile. To show the validity of the proposed method, we apply it to classify four DNA microarray datasets involving various human normal and tumor tissue samples. The experimental results show that the method is efficient and feasible.
Availability: Matlab scripts are available on request.
Contact: dshuang@iim.ac.cn
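The two-stage idea in the abstract (independent component analysis to model the data, then a regularized regression step to classify) can be illustrated with a minimal Python sketch. This is not the authors' Matlab code. It assumes a samples-by-genes matrix, a symmetric FastICA with a tanh nonlinearity, and ridge regression onto class indicators as a simplified stand-in for penalized optimal scoring. All function names, the number of components `k`, the penalty `lam`, and the synthetic data are hypothetical choices made for illustration.

```python
import numpy as np

def fit_ica(X, k, n_iter=200, tol=1e-6, seed=0):
    """Fit PCA whitening plus a symmetric FastICA rotation on X (samples x genes)."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    mean = X.mean(axis=0)
    # A thin SVD keeps the cost tied to n rather than p, which matters when p >> n.
    _, s, Vt = np.linalg.svd(X - mean, full_matrices=False)
    K = (Vt[:k].T / s[:k]) * np.sqrt(n)               # p x k whitening matrix
    Z = (X - mean) @ K                                # n x k, identity covariance
    W, _ = np.linalg.qr(rng.standard_normal((k, k)))  # random orthogonal start
    for _ in range(n_iter):
        G = np.tanh(Z @ W.T)                          # nonlinearity g(w_i^T z)
        # Fixed-point update: E[z g(w^T z)] - E[g'(w^T z)] w, for every row of W.
        W_new = (G.T @ Z) / n - np.diag((1.0 - G**2).mean(axis=0)) @ W
        U, _, Vt_w = np.linalg.svd(W_new)             # symmetric decorrelation
        W_new = U @ Vt_w
        converged = np.max(np.abs(np.abs(np.sum(W_new * W, axis=1)) - 1.0)) < tol
        W = W_new
        if converged:
            break
    return mean, K, W

def ica_features(X, model):
    """Project samples onto the estimated independent components."""
    mean, K, W = model
    return (X - mean) @ K @ W.T

def fit_ridge_indicator(F, y, lam=1.0):
    """Ridge regression of one-hot class indicators on the features F."""
    classes = np.unique(y)
    Y = (y[:, None] == classes[None, :]).astype(float)
    y_bar = Y.mean(axis=0)                            # intercept (F is centered)
    B = np.linalg.solve(F.T @ F + lam * np.eye(F.shape[1]), F.T @ (Y - y_bar))
    return classes, y_bar, B

def predict(F, clf):
    """Assign each sample to the class with the largest fitted score."""
    classes, y_bar, B = clf
    return classes[np.argmax(F @ B + y_bar, axis=1)]

# Synthetic stand-in for expression data: 40 samples, 200 genes, 2 classes.
rng = np.random.default_rng(1)
y = np.repeat([0, 1], 20)
X = rng.standard_normal((40, 200))
X[y == 1, :20] += 3.0                                 # 20 "marker genes" shifted

model = fit_ica(X, k=3)
clf = fit_ridge_indicator(ica_features(X, model), y, lam=1.0)
train_accuracy = np.mean(predict(ica_features(X, model), clf) == y)
```

The ridge penalty `lam` plays the role of the regularization that the abstract motivates for correlated predictors. In practice both `k` and `lam` would be chosen by cross-validation rather than fixed as they are here.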