Discovering statistically significant pathways in expression profiling studies
- 8 September 2005
- journal article
- research article
- Published in Proceedings of the National Academy of Sciences
- Vol. 102 (38) , 13544-13549
- https://doi.org/10.1073/pnas.0506577102
Abstract
Accurate and rapid identification of perturbed pathways through the analysis of genome-wide expression profiles facilitates the generation of biological hypotheses. We propose a statistical framework for determining whether a specified group of genes for a pathway has a coordinated association with a phenotype of interest. Several issues on proper hypothesis-testing procedures are clarified. In particular, it is shown that the differences in the correlation structure of each set of genes can lead to a biased comparison among gene sets unless a normalization procedure is applied. We propose statistical tests for two important but different aspects of association for each group of genes. This approach has more statistical power than currently available methods and can result in the discovery of statistically significant pathways that are not detected by other methods. This method is applied to data sets involving diabetes, inflammatory myopathies, and Alzheimer's disease, using gene sets we compiled from various public databases. In the case of inflammatory myopathies, we have correctly identified the known cytotoxic T lymphocyte-mediated autoimmunity in inclusion body myositis. Furthermore, we predicted the presence of dendritic cells in inclusion body myositis and of an IFN-α/β response in dermatomyositis, neither of which was previously described. These predictions have been subsequently corroborated by immunohistochemistry.
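The normalization point in the abstract can be illustrated with a small, generic sketch. It is not the authors' procedure, whose details are not given in this record. It assumes a gene-set score equal to the mean of per-gene Welch t statistics, and it builds the null distribution by permuting sample labels, which keeps the gene-to-gene correlation within the set intact. Dividing the centered score by the null standard deviation is one way to put sets with different correlation structure on a comparable scale. All function names (`t_stat`, `set_score`, `normalized_set_test`, `toy_data`) and the simulated data are hypothetical.

```python
import math
import random
import statistics


def t_stat(x, y):
    """Welch two-sample t statistic for two lists of expression values."""
    vx = statistics.variance(x) / len(x)
    vy = statistics.variance(y) / len(y)
    return (statistics.mean(x) - statistics.mean(y)) / math.sqrt(vx + vy)


def set_score(expr, labels, gene_set):
    """Mean per-gene t statistic (label 1 vs. label 0) over a gene set."""
    scores = []
    for gene in gene_set:
        case = [v for v, lab in zip(expr[gene], labels) if lab == 1]
        ctrl = [v for v, lab in zip(expr[gene], labels) if lab == 0]
        scores.append(t_stat(case, ctrl))
    return statistics.mean(scores)


def normalized_set_test(expr, labels, gene_set, n_perm=500, seed=0):
    """Return (normalized score, permutation p-value) for one gene set.

    Sample labels are shuffled, so each permuted score is computed on the
    same correlated genes; the null spread therefore reflects the set's own
    correlation structure.
    """
    rng = random.Random(seed)
    observed = set_score(expr, labels, gene_set)
    shuffled = list(labels)
    null = []
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        null.append(set_score(expr, shuffled, gene_set))
    mu = statistics.mean(null)
    sd = statistics.stdev(null)
    z = (observed - mu) / sd
    # Two-sided p-value with the usual +1 correction to avoid p = 0.
    extreme = sum(1 for s in null if abs(s - mu) >= abs(observed - mu))
    p = (extreme + 1) / (n_perm + 1)
    return z, p


def toy_data(seed=1, n_per_group=10):
    """Simulated data (an assumption): genes g0-g4 are shifted in cases."""
    rng = random.Random(seed)
    labels = [1] * n_per_group + [0] * n_per_group
    expr = {}
    for i in range(10):
        shift = 2.0 if i < 5 else 0.0
        expr[f"g{i}"] = [rng.gauss(shift * lab, 1.0) for lab in labels]
    return expr, labels
```

Calling `normalized_set_test(expr, labels, ["g0", "g1", "g2", "g3", "g4"])` on the output of `toy_data()` scores the shifted set; the normalized scores of different sets can then be compared directly, which raw summed or averaged statistics would not allow when the sets differ in internal correlation.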