Pattern identification and classification in gene expression data using an autoassociative neural network model
- 23 December 2002
- journal article
- research article
- Published by Wiley in Biotechnology & Bioengineering
- Vol. 81 (5) , 594-606
- https://doi.org/10.1002/bit.10505
Abstract
The application of DNA microarray technology for analysis of gene expression creates enormous opportunities to accelerate the pace in understanding living systems and identification of target genes and pathways for drug development and therapeutic intervention. Parallel monitoring of the expression profiles of thousands of genes seems particularly promising for a deeper understanding of cancer biology and the identification of molecular signatures supporting the histological classification schemes of neoplastic specimens. However, the increasing volume of data generated by microarray experiments poses the challenge of developing equally efficient methods and analysis procedures to extract, interpret, and upgrade the information content of these databases. Herein, a computational procedure for pattern identification, feature extraction, and classification of gene expression data through the analysis of an autoassociative neural network model is described. The identified patterns and features contain critical information about gene–phenotype relationships observed during changes in cell physiology. They represent a rational and dimensionally reduced base for understanding the basic biology of the onset of diseases, defining targets of therapeutic intervention, and developing diagnostic tools for the identification and classification of pathological states. The proposed method has been tested on two different microarray datasets—Golub's analysis of acute human leukemia [Golub et al. (1999) Science 286:531–537], and the human colon adenocarcinoma study presented by Alon et al. [1999; Proc Natl Acad Sci USA 97:10101–10106]. The analysis of the neural network internal structure allows the identification of specific phenotype markers and the extraction of peculiar associations among genes and physiological states. 
At the same time, the neural network outputs provide assignment to multiple classes, such as different pathological conditions or tissue samples, for previously unseen instances. © 2003 Wiley Periodicals, Inc. Biotechnol Bioeng 81: 594–606, 2003.
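The abstract does not give the network architecture or training procedure, so the following is only an illustrative sketch of the general idea of an autoassociative network: a model trained to reproduce its own input through a narrow hidden layer, whose activations then serve as a dimensionally reduced description of each sample. Everything specific here is an assumption made for illustration and not taken from the paper: the single tanh bottleneck layer, the linear output layer, plain batch gradient descent, the function names `train_autoassociative` and `encode`, and the synthetic "expression" matrix standing in for real microarray data.

```python
import numpy as np


def train_autoassociative(X, n_hidden=2, lr=0.1, epochs=2000, seed=0):
    """Fit a one-hidden-layer autoassociative network (assumed architecture).

    X has shape (n_samples, n_genes). The network maps X to a tanh
    bottleneck of size n_hidden and back to n_genes linear outputs,
    minimising the mean squared reconstruction error.
    """
    rng = np.random.default_rng(seed)
    n_genes = X.shape[1]
    W1 = rng.normal(scale=0.1, size=(n_genes, n_hidden))
    b1 = np.zeros(n_hidden)
    W2 = rng.normal(scale=0.1, size=(n_hidden, n_genes))
    b2 = np.zeros(n_genes)
    losses = []
    for _ in range(epochs):
        # Forward pass: input -> bottleneck -> reconstruction.
        H = np.tanh(X @ W1 + b1)
        X_hat = H @ W2 + b2
        err = X_hat - X
        losses.append(float(np.mean(err ** 2)))
        # Backward pass for the mean squared error.
        grad_out = 2.0 * err / err.size
        gW2 = H.T @ grad_out
        gb2 = grad_out.sum(axis=0)
        grad_H = (grad_out @ W2.T) * (1.0 - H ** 2)
        gW1 = X.T @ grad_H
        gb1 = grad_H.sum(axis=0)
        # Gradient descent step on all parameters.
        W1 -= lr * gW1
        b1 -= lr * gb1
        W2 -= lr * gW2
        b2 -= lr * gb2
    return (W1, b1, W2, b2), losses


def encode(X, params):
    """Return the bottleneck activations (reduced features) for X."""
    W1, b1, _, _ = params
    return np.tanh(X @ W1 + b1)


# Synthetic stand-in data (hypothetical): 40 samples x 20 genes,
# two classes that differ by a fixed expression signature plus noise.
data_rng = np.random.default_rng(1)
labels = np.repeat([0, 1], 20)
signature = data_rng.normal(size=20)
X = labels[:, None] * signature + 0.3 * data_rng.normal(size=(40, 20))
X = (X - X.mean(axis=0)) / X.std(axis=0)  # standardise each gene

params, losses = train_autoassociative(X)
codes = encode(X, params)  # one low-dimensional code per sample
```

In a sketch like this, the rows of `codes` play the role of extracted features, and the magnitudes of the entries in the first weight matrix hint at which genes contribute most to each hidden unit. A classifier for unseen samples would be a separate step built on top of these codes or on additional output units.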