A multivariate analysis approach to the integration of proteomic and gene expression data
- 26 June 2007
- journal article
- research article
- Published by Wiley in Proteomics
- Vol. 7 (13), 2162-2171
- https://doi.org/10.1002/pmic.200600898
Abstract
In order to understand even the simplest cellular processes, we need to integrate proteomic, gene expression and other biomolecular data. To date, most computational approaches aimed at integrating proteomics and gene expression data use direct gene/protein correlation measures. However, due to post‐transcriptional and translational regulations, the correspondence between the expression of a gene and its protein is complicated. We apply a multivariate statistical method, co‐inertia analysis (CIA), to visualise gene and proteomic expression data stemming from the same biological samples. Principal components analysis or correspondence analysis can be used for data exploration on single datasets. CIA is then used to explore the relationships between two or more datasets. We further explore the data by projecting gene ontology (GO) information onto these plots to describe the cellular processes in action. We apply these techniques to gene expression and protein abundance data from studies of the human malarial parasite life cycle and the NCI‐60 cancer cell lines. In each case, we visualise gene expression, protein abundance and GO classes in the same low dimensional projections and identify GO classes that are likely to be of biological importance.
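
To illustrate the idea behind co-inertia analysis mentioned in the abstract, the following is a minimal sketch in Python with NumPy. It is not the authors' implementation. It assumes two tables measured on the same samples (rows are samples, columns are genes or proteins), uses simple column centring as in a PCA-based analysis, and gives every sample equal weight. The function name `co_inertia` and its parameters are hypothetical. The co-inertia axes are taken from the singular value decomposition of the cross-covariance matrix between the two tables, and the RV coefficient summarises their overall similarity.

```python
import numpy as np


def co_inertia(X, Y, n_axes=2):
    """Simplified co-inertia analysis of two tables sharing the same samples.

    X: array of shape (n_samples, p), e.g. gene expression (assumed layout).
    Y: array of shape (n_samples, q), e.g. protein abundance (assumed layout).
    Returns sample scores for each table, the singular values, and the RV coefficient.
    """
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)
    if X.shape[0] != Y.shape[0]:
        raise ValueError("X and Y must have the same number of samples (rows)")
    n = X.shape[0]

    # Column-centre each table (assumption: PCA-style preprocessing, uniform row weights).
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)

    # Cross-covariance between the variables of the two tables (p x q).
    C = Xc.T @ Yc / n

    # The singular vectors give pairs of axes that maximise the covariance
    # between the projected samples of the two tables.
    U, s, Vt = np.linalg.svd(C, full_matrices=False)
    U, s, Vt = U[:, :n_axes], s[:n_axes], Vt[:n_axes]

    scores_x = Xc @ U      # samples projected onto the co-inertia axes of X
    scores_y = Yc @ Vt.T   # samples projected onto the co-inertia axes of Y

    # RV coefficient: total co-inertia normalised by the inertia of each table.
    cov_x = Xc.T @ Xc / n
    cov_y = Yc.T @ Yc / n
    rv = np.sum(C ** 2) / np.sqrt(np.sum(cov_x ** 2) * np.sum(cov_y ** 2))

    return scores_x, scores_y, s, rv


if __name__ == "__main__":
    # Synthetic data only, for demonstration; not taken from the study.
    rng = np.random.default_rng(0)
    genes = rng.normal(size=(10, 5))
    proteins = rng.normal(size=(10, 4))
    sx, sy, s, rv = co_inertia(genes, proteins)
    print(sx.shape, sy.shape, s.shape)
```

Plotting the two sets of sample scores on the same axes gives the kind of joint low dimensional view the abstract describes. An RV coefficient near 1 would indicate that the two tables arrange the samples in a similar way.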