Blind Source Separation and the Analysis of Microarray Data
- 1 December 2004
- journal article
- Published by Mary Ann Liebert Inc in Journal of Computational Biology
- Vol. 11 (6), 1090-1109
- https://doi.org/10.1089/cmb.2004.11.1090
Abstract
We develop an approach for the exploratory analysis of gene expression data, based upon blind source separation techniques. This approach exploits higher-order statistics to identify a linear model for (logarithms of) expression profiles, described as linear combinations of "independent sources." As a result, it yields "elementary expression patterns" (the "sources"), which may be interpreted as potential regulation pathways. Further analysis of the sources obtained in this way shows that they are generally characterized by a small number of specific coexpressed or antiexpressed genes. In addition, the projections of the expression profiles onto the estimated sources often provide significant clustering of conditions. The algorithm relies on a large number of runs of "independent component analysis" with random initializations, followed by a search for "consensus sources." It then provides estimates for independent sources, together with an assessment of their robustness. The results obtained on two datasets (namely, breast cancer data and Bacillus subtilis sulfur metabolism data) show that some of the obtained gene families correspond to well-known families of coregulated genes, which validates the proposed approach.
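The procedure outlined in the abstract (many ICA runs from random initializations, then a search for sources that recur across runs) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say which ICA algorithm or consensus criterion is used, so the sketch assumes a FastICA-style fixed-point iteration with a tanh nonlinearity, a greedy grouping of estimated sources by absolute correlation, and a robustness score defined as the fraction of runs in which a source reappears. The function names, the threshold `thresh=0.9`, and the layout of `X` (rows are conditions, columns are genes, entries are log-expression values) are all hypothetical choices made for illustration.

```python
import numpy as np

def whiten(X, n_comp):
    """Center each row of X (conditions x genes) and project onto n_comp whitened PCs."""
    Xc = X - X.mean(axis=1, keepdims=True)
    d, E = np.linalg.eigh(Xc @ Xc.T / Xc.shape[1])  # eigenvalues in ascending order
    d, E = d[-n_comp:], E[:, -n_comp:]              # keep the n_comp largest
    return (E / np.sqrt(d)).T @ Xc

def sym_decorrelate(W):
    """Return (W W^T)^(-1/2) W so that the rows of W become orthonormal."""
    d, E = np.linalg.eigh(W @ W.T)
    return (E / np.sqrt(d)) @ E.T @ W

def ica_once(Z, rng, n_iter=200, tol=1e-6):
    """One FastICA-style run (assumed stand-in) from a random initialization."""
    n = Z.shape[0]
    W = sym_decorrelate(rng.standard_normal((n, n)))
    for _ in range(n_iter):
        G = np.tanh(W @ Z)
        W_new = G @ Z.T / Z.shape[1] - (1 - G**2).mean(axis=1)[:, None] * W
        W_new = sym_decorrelate(W_new)
        # Converged when each new row is (anti)parallel to the previous one.
        delta = np.max(np.abs(np.abs(np.sum(W_new * W, axis=1)) - 1))
        W = W_new
        if delta < tol:
            break
    return W @ Z  # estimated sources, shape (n_comp, n_genes)

def consensus_ica(X, n_comp, n_runs=20, thresh=0.9, seed=0):
    """Run ICA n_runs times and group recurring sources (hypothetical criterion)."""
    rng = np.random.default_rng(seed)
    Z = whiten(X, n_comp)
    S_all = np.vstack([ica_once(Z, rng) for _ in range(n_runs)])
    R = np.corrcoef(S_all)
    close = np.abs(R) > thresh            # sign of an ICA source is arbitrary
    remaining = np.ones(len(S_all), dtype=bool)
    results = []
    while remaining.any():
        # Seed the next group with the unassigned source that has most neighbours.
        counts = (close & remaining[None, :]).sum(axis=1) * remaining
        i = int(np.argmax(counts))
        members = np.where(close[i] & remaining)[0]
        signs = np.sign(R[i, members])    # align signs before averaging
        source = (signs[:, None] * S_all[members]).mean(axis=0)
        results.append((source, len(members) / n_runs))
        remaining[members] = False
    results.sort(key=lambda r: r[1], reverse=True)
    return results  # list of (consensus source, robustness score)
```

Calling `consensus_ica(X, n_comp=3)` returns one entry per group of matching sources. Under these assumptions, a score near 1 means the source was recovered in nearly every run, while a low score flags a source that depends on the initialization. Genes with large absolute values in a consensus source would be the candidates for the coexpressed or antiexpressed gene families mentioned above.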