New gene selection method for classification of cancer subtypes considering within‐class variation
Open Access
- 9 August 2003
- journal article
- research article
- Published by Wiley in FEBS Letters
- Vol. 551 (1-3) , 3-7
- https://doi.org/10.1016/s0014-5793(03)00819-6
Abstract
In this work we propose a new method for finding gene subsets of microarray data that effectively discriminate subtypes of disease. We developed a new criterion for measuring the relevance of individual genes, using the mean and standard deviation of the distances from each sample to the class centroid, in order to address a well‐known problem in gene selection: large within‐class variation. This approach also has the advantage that it is applicable not only to binary classification but also to multi‐class classification problems. We demonstrated the performance of the method by applying it to publicly available microarray datasets: leukemia (two classes) and small round blue cell tumors (four classes). The proposed method yields a very small number of genes compared with previous methods without loss of discriminating power, and thus it can effectively facilitate further biological and clinical research.
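The abstract does not state the criterion's formula, so the following is only a hypothetical illustration of the general idea, not the authors' method. It assumes a per-gene score that divides the average gap between class centroids by a within-class spread term, where the spread is the mean plus the standard deviation of each sample's distance to its own class centroid. The function name `gene_scores`, the exact ratio, and the toy data are all assumptions made for this sketch.

```python
import numpy as np


def gene_scores(X, y):
    """Hypothetical per-gene relevance score (not the paper's exact criterion).

    X: array of shape (n_samples, n_genes) with expression values.
    y: array of shape (n_samples,) with class labels (two or more classes).
    Returns an array of shape (n_genes,); higher means more discriminative.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    classes = np.unique(y)
    n_classes = len(classes)
    if n_classes < 2:
        raise ValueError("at least two classes are required")

    # One centroid per class and gene: shape (n_classes, n_genes).
    centroids = np.stack([X[y == c].mean(axis=0) for c in classes])

    # Within-class spread: mean + standard deviation of the distances
    # from each sample to its own class centroid, averaged over classes.
    spread = np.zeros(X.shape[1])
    for k, c in enumerate(classes):
        dist = np.abs(X[y == c] - centroids[k])
        spread += dist.mean(axis=0) + dist.std(axis=0)
    spread /= n_classes

    # Between-class separation: average absolute gap over all ordered
    # pairs of distinct class centroids (works for any number of classes).
    gaps = np.abs(centroids[:, None, :] - centroids[None, :, :])
    separation = gaps.sum(axis=(0, 1)) / (n_classes * (n_classes - 1))

    # Small constant avoids division by zero for genes with no spread.
    return separation / (spread + 1e-12)


if __name__ == "__main__":
    # Toy data (made up): gene 0 separates the classes, gene 1 does not.
    X = [[0.0, 1.0], [0.2, 5.0], [10.0, 1.2], [10.2, 4.8]]
    y = [0, 0, 1, 1]
    scores = gene_scores(X, y)
    print("ranking (best first):", np.argsort(scores)[::-1])
```

Because the separation term averages over all pairs of class centroids, the same sketch applies unchanged to two-class and four-class data, which mirrors the abstract's point about binary and multi-class problems.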