Elucidating the Altered Transcriptional Programs in Breast Cancer using Independent Component Analysis
Open Access
- 17 August 2007
- journal article
- research article
- Published by Public Library of Science (PLoS) in PLoS Computational Biology
- Vol. 3 (8), e161
- https://doi.org/10.1371/journal.pcbi.0030161
Abstract
The quantity of mRNA transcripts in a cell is determined by a complex interplay of cooperative and counteracting biological processes. Independent Component Analysis (ICA) is one of a small number of unsupervised algorithms that have been applied to microarray gene expression data in an attempt to understand phenotype differences in terms of changes in the activation/inhibition patterns of biological pathways. While the ICA model has been shown to outperform other linear representations of the data such as Principal Components Analysis (PCA), a validation using explicit pathway and regulatory element information has not yet been performed. We apply a range of popular ICA algorithms to six of the largest microarray cancer datasets and use pathway-knowledge and regulatory-element databases for validation. We show that ICA outperforms PCA and clustering-based methods in that ICA components map closer to known cancer-related pathways, regulatory modules, and cancer phenotypes. Furthermore, we identify cancer signalling and oncogenic pathways and regulatory modules that play a prominent role in breast cancer and relate the differential activation patterns of these to breast cancer phenotypes. Importantly, we find novel associations linking immune response and epithelial–mesenchymal transition pathways with estrogen receptor status and histological grade, respectively. In addition, we find associations linking the activity levels of biological pathways and transcription factors (NF1 and NFAT) with clinical outcome in breast cancer. ICA provides a framework for a more biologically relevant interpretation of genome-wide transcriptomic data. Adopting ICA as the analysis tool of choice will help to understand the phenotype–pathway relationship and thus help elucidate the molecular taxonomy of heterogeneous cancers and of other complex genetic diseases.
The amount of a given transcript or protein in a cell is determined by a balance of expression and repression in a complex network of biological processes. This delicate balance is compromised in complex genetic diseases such as cancer by alterations in the activation patterns of functionally important biological processes known as pathways. Over recent years, a large number of microarray experiments profiling the expression levels of more than 20,000 human genes in hundreds of tumor samples have shown that most cancer types are heterogeneous diseases, each characterized by many different expression subtypes. The biological and clinical goal is to explain the observed tumor and clinical heterogeneity in terms of specific patterns of altered pathways. The bioinformatic challenge is therefore to devise mathematical tools that explicitly attempt to infer these altered pathways. To this end, we applied a signal processing tool in a meta-analysis of breast cancer, encompassing more than 800 tumor specimens derived from four different patient cohorts, and showed that this algorithm significantly outperforms popular standard bioinformatics tools in identifying altered pathways underlying breast cancer. These results show that the same tool could be applied to other complex human genetic diseases to better elucidate the underlying altered pathways.
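For orientation, the linear decomposition that the abstract refers to, fitted by both ICA and PCA, can be sketched as below. This is the generic textbook formulation, not notation taken from the paper itself: the symbols X, S, A, G, N, and K are introduced here purely for illustration, and mean-centering of each gene is an assumption.

```latex
% Assumed notation (not from the paper):
%   X : G x N expression matrix (G genes, N tumor samples), mean-centered per gene
%   S : G x K matrix of components; column k holds the gene weights of component k
%   A : K x N mixing matrix; row k holds the activity of component k in each sample
\[
  X \approx S A, \qquad
  X \in \mathbb{R}^{G \times N},\quad
  S \in \mathbb{R}^{G \times K},\quad
  A \in \mathbb{R}^{K \times N}
\]
% PCA: the columns s_1, ..., s_K of S are only required to be mutually uncorrelated
\[
  \operatorname{cov}(s_j, s_k) = 0 \quad \text{for } j \neq k
\]
% ICA: the columns are chosen to be as statistically independent as possible,
% i.e. their joint density should approximately factorize
\[
  p(s_1, \ldots, s_K) \approx \prod_{k=1}^{K} p(s_k)
\]
```

Under this reading, the genes with the largest absolute weights in a column of S would be the natural candidates to compare against a pathway or regulatory-module gene set, and the corresponding row of A would be the quantity compared against phenotypes such as estrogen receptor status or histological grade. Whether the study uses exactly this orientation and preprocessing is not stated in the abstract.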