Block principal component analysis with application to gene microarray data classification
- 24 October 2002
- journal article
- research article
- Published by Wiley in Statistics in Medicine
- Vol. 21 (22), 3465-3474
- https://doi.org/10.1002/sim.1263
Abstract
We propose a block principal component analysis method for extracting information from a database with a large number of variables and a relatively small number of subjects, such as a microarray gene expression database. This new procedure has the advantage of computational simplicity, and theory and numerical results demonstrate it to be as efficient as ordinary principal component analysis when used for dimension reduction, variable selection, data visualization and classification. The method is illustrated with the well-known National Cancer Institute database of gene microarray expression data for 60 human cancer cell lines (NCI60), in the context of classification of cancer cell lines. Copyright © 2002 John Wiley & Sons, Ltd.
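The abstract does not spell out the procedure, so the following is only a minimal sketch of one way a block-wise PCA could be organized, not the authors' algorithm. It assumes the variables are split into blocks, ordinary PCA is run inside each block, and the retained block scores are pooled and reduced once more. The function names `pca_scores` and `block_pca`, the contiguous block partition, and the component counts are all hypothetical, and the data are synthetic rather than NCI60.

```python
import numpy as np

def pca_scores(X, n_components):
    """Return the leading principal component scores of X (rows = subjects)."""
    Xc = X - X.mean(axis=0)  # center each variable
    # SVD of the centered data; the columns of U * S are the PC scores.
    U, S, _ = np.linalg.svd(Xc, full_matrices=False)
    return U[:, :n_components] * S[:n_components]

def block_pca(X, blocks, per_block=2, n_final=2):
    """Hypothetical block PCA: PCA within each block, then PCA on the pooled scores."""
    pooled = np.hstack([pca_scores(X[:, idx], per_block) for idx in blocks])
    return pca_scores(pooled, n_final)

rng = np.random.default_rng(0)
X = rng.normal(size=(60, 1000))  # synthetic stand-in: 60 subjects, 1000 variables
# Assumption: 10 contiguous blocks of 100 variables each.
blocks = np.array_split(np.arange(X.shape[1]), 10)
scores = block_pca(X, blocks)
print(scores.shape)  # (60, 2)
```

Each SVD here acts on a 60 x 100 block rather than the full 60 x 1000 matrix, which is the kind of computational saving the abstract alludes to. How the blocks are chosen and how many components are kept per block are left open in this sketch.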