Simultaneous analysis of distinct Omics data sets with integration of biological knowledge: Multiple Factor Analysis approach
Open Access
- 20 January 2009
- journal article
- research article
- Published by Springer Nature in BMC Genomics
- Vol. 10 (1), 1-17
- https://doi.org/10.1186/1471-2164-10-32
Abstract
Background: Genomic analysis will greatly benefit from considering, in a global way, various sources of molecular data together with the related biological knowledge. It is thus of great importance to provide integrative approaches that ease the interpretation of microarray data. Results: Here, we introduce a data-mining approach, Multiple Factor Analysis (MFA), to combine multiple data sets and to add formalized knowledge. MFA is used to jointly analyse the structure emerging from genomic and transcriptomic data sets. The common structures are highlighted, and graphical outputs are provided so that the biological meaning becomes easily retrievable. Gene Ontology terms are used to build gene modules that are superimposed on the experimentally interpreted plots. Functional interpretations are then supported by a step-by-step sequence of graphical representations. Conclusion: When applied to genomic and transcriptomic data and associated Gene Ontology annotations, our method prioritizes the biological processes linked to the experimental settings. Furthermore, it reduces the time and effort needed to analyze large amounts of 'Omics' data.
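The way MFA combines several data sets measured on the same samples can be sketched as follows. This is a minimal illustration in Python with NumPy, not the authors' implementation. It assumes the standard MFA formulation, in which each table is standardized and divided by its first singular value before a global PCA is run on the concatenated tables. The function name `mfa` and the toy data are hypothetical.

```python
import numpy as np


def mfa(tables, n_components=2):
    """Minimal MFA sketch (hypothetical helper, not the paper's code).

    tables: list of (samples x variables) arrays sharing the same rows.
    Returns the sample scores and eigenvalues of the global analysis.
    """
    weighted = []
    for table in tables:
        x = np.asarray(table, dtype=float)
        # Centre each variable, then scale it to unit variance
        # (constant columns are only centred).
        x = x - x.mean(axis=0)
        std = x.std(axis=0)
        x = x / np.where(std > 0, std, 1.0)
        # Divide the whole table by its first singular value so that
        # no single data set dominates the first global axis.
        # Assumes the table is not entirely constant (first_sv > 0).
        first_sv = np.linalg.svd(x, compute_uv=False)[0]
        weighted.append(x / first_sv)
    # Global PCA, computed by SVD, on the column-wise concatenation.
    merged = np.hstack(weighted)
    u, s, _ = np.linalg.svd(merged, full_matrices=False)
    scores = u[:, :n_components] * s[:n_components]
    eigenvalues = s[:n_components] ** 2
    return scores, eigenvalues


# Hypothetical toy data: 6 samples measured on two platforms
# that share one underlying structure.
rng = np.random.default_rng(0)
latent = rng.normal(size=(6, 1))
genomic = latent @ rng.normal(size=(1, 5)) + 0.1 * rng.normal(size=(6, 5))
transcriptomic = latent @ rng.normal(size=(1, 8)) + 0.1 * rng.normal(size=(6, 8))

scores, eigenvalues = mfa([genomic, transcriptomic])
```

With this weighting, the first eigenvalue of the global analysis lies between 1 and the number of tables. A value close to the upper bound suggests that the first axis reflects structure common to all data sets.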