Microarray-based gene set analysis: a comparison of current methods
Open Access
- 27 November 2008
- journal article
- research article
- Published by Springer Nature in BMC Bioinformatics
- Vol. 9 (1) , 1-14
- https://doi.org/10.1186/1471-2105-9-502
Abstract
The analysis of gene sets has become a popular topic, with researchers attempting to improve the interpretability and reproducibility of their microarray analyses by including supplementary biological information. While a number of options for gene set analysis exist, no consensus has yet been reached on which methodology performs best, and under what conditions. The goal of this work was to examine the performance characteristics of a collection of existing gene set analysis methods on both simulated and real microarray data sets. Of particular interest was the potential utility gained by incorporating inter-gene correlation into the analysis process. Each of six gene set analysis methods was applied to both simulated and publicly available microarray data sets. Overall, all of the methodologies were better at detecting gene sets that moved from non-active (i.e., genes not expressed) to active states (or vice versa) than at detecting those that simply changed their level of activity. Methods that incorporate correlation structures provided an increased ability to detect altered gene sets in some settings. The results obtained from the simulated data show that the performance of gene set analysis methods is strongly influenced by the features of the data set in question, and that methods which incorporate correlation structures into the analysis process tend to achieve better performance than methods which rely on univariate test statistics.
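The distinction drawn above between univariate test statistics and methods that use inter-gene correlation can be illustrated with a small sketch. This is not one of the six methods evaluated in the paper, which the abstract does not name. It is a hypothetical example, assuming a gene set of 5 correlated genes, 10 samples per group, a mean absolute t-statistic as the univariate summary, and a ridge-stabilised Hotelling T² as the correlation-aware summary, with both p-values obtained by permuting sample labels. All function names and simulation parameters are illustrative assumptions.

```python
import numpy as np


def mean_abs_t(x, labels):
    """Univariate summary: mean absolute two-sample t-statistic over the set's genes."""
    a, b = x[:, labels == 0], x[:, labels == 1]
    na, nb = a.shape[1], b.shape[1]
    pooled_var = ((na - 1) * a.var(axis=1, ddof=1)
                  + (nb - 1) * b.var(axis=1, ddof=1)) / (na + nb - 2)
    t = (a.mean(axis=1) - b.mean(axis=1)) / np.sqrt(pooled_var * (1 / na + 1 / nb))
    return float(np.mean(np.abs(t)))


def hotelling_t2(x, labels, ridge=1e-3):
    """Multivariate summary: Hotelling T^2, which uses the inter-gene covariance."""
    a, b = x[:, labels == 0], x[:, labels == 1]
    na, nb = a.shape[1], b.shape[1]
    diff = a.mean(axis=1) - b.mean(axis=1)
    # np.cov treats rows (genes) as variables and columns (samples) as observations.
    pooled_cov = ((na - 1) * np.cov(a) + (nb - 1) * np.cov(b)) / (na + nb - 2)
    # Small ridge term (an assumption of this sketch) keeps the solve well conditioned.
    pooled_cov = pooled_cov + ridge * np.eye(x.shape[0])
    return float((na * nb / (na + nb)) * diff @ np.linalg.solve(pooled_cov, diff))


def permutation_p(stat, x, labels, n_perm=200, seed=0):
    """Permutation p-value: share of label shuffles with a statistic >= the observed one."""
    rng = np.random.default_rng(seed)
    observed = stat(x, labels)
    hits = sum(stat(x, rng.permutation(labels)) >= observed for _ in range(n_perm))
    return (1 + hits) / (1 + n_perm)


# Simulated data (assumed values): genes share a common factor, so they are correlated.
rng = np.random.default_rng(42)
n_set_genes, n_per_group = 5, 10
labels = np.array([0] * n_per_group + [1] * n_per_group)
shared = rng.normal(size=(1, 2 * n_per_group))
x = 0.7 * shared + rng.normal(size=(n_set_genes, 2 * n_per_group))
x[:, labels == 1] += 3.0  # large shift in group 1 so the set is clearly altered

p_uni = permutation_p(mean_abs_t, x, labels)
p_multi = permutation_p(hotelling_t2, x, labels)
print(f"univariate p = {p_uni:.4f}, multivariate p = {p_multi:.4f}")
```

With a shift this large both summaries flag the set. The two approaches would be expected to diverge in subtler settings, which is the kind of comparison the study reports.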