New gene selection method for classification of cancer subtypes considering within‐class variation

Open Access

9 August 2003

journal article
research article
Published by Wiley in FEBS Letters

Vol. 551 (1-3) , 3-7
https://doi.org/10.1016/s0014-5793(03)00819-6

Abstract

In this work we propose a new method for finding gene subsets in microarray data that effectively discriminate subtypes of disease. To address a well‐known problem in gene selection, large within‐class variation, we developed a new criterion that measures the relevance of individual genes using the mean and standard deviation of the distances from each sample to the class centroid. The approach also has the advantage of being applicable not only to binary classification but also to multi‐class problems. We demonstrate its performance by applying it to two publicly available microarray datasets: leukemia (two classes) and small round blue cell tumors (four classes). Compared with previous methods, the proposed method yields a very small number of genes without loss of discriminating power, and thus it can effectively facilitate further biological and clinical research.
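
The abstract does not state the criterion itself, so the following is only an illustrative sketch of the general idea: rank each gene by how far apart the class centroids are relative to the mean and standard deviation of sample-to-centroid distances. The scoring formula, the function names `gene_scores` and `select_genes`, and the use of absolute per-gene distance are all assumptions made for illustration, not the authors' published method.

```python
import numpy as np

def gene_scores(X, y):
    """Score each gene (column of X) for class separation.

    Hypothetical criterion, NOT the paper's exact formula:
    spread of the class centroids divided by (mean + std) of the
    distances from each sample to its own class centroid.
    """
    X = np.asarray(X, dtype=float)   # shape (n_samples, n_genes)
    y = np.asarray(y)                # shape (n_samples,)
    classes = np.unique(y)           # sorted unique class labels

    # Per-gene centroid of each class: shape (n_classes, n_genes)
    centroids = np.stack([X[y == c].mean(axis=0) for c in classes])

    # Per-gene distance from each sample to its own class centroid
    own = np.searchsorted(classes, y)
    dist = np.abs(X - centroids[own])

    # Within-class variation: mean plus standard deviation of distances
    within = dist.mean(axis=0) + dist.std(axis=0)

    # Between-class spread: std of centroids, defined for any number of classes
    between = centroids.std(axis=0)

    # Small constant avoids division by zero for constant genes
    return between / (within + 1e-12)

def select_genes(X, y, k):
    """Return the indices of the k highest-scoring genes, best first."""
    return np.argsort(gene_scores(X, y))[::-1][:k]
```

Because the between-class term is computed over all class centroids at once, the same sketch applies to two-class data such as the leukemia set and to four-class data such as the small round blue cell tumor set without modification.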
