A Cross-Study Comparison of Gene Expression Studies for the Molecular Classification of Lung Cancer
- 1 May 2004
- journal article
- research article
- Published by American Association for Cancer Research (AACR) in Clinical Cancer Research
- Vol. 10 (9), 2922-2927
- https://doi.org/10.1158/1078-0432.ccr-03-0490
Abstract
Purpose: Recent studies sought to refine lung cancer classification using gene expression microarrays. We evaluate the extent to which these studies agree and whether results can be integrated.
Experimental Design: We developed a practical analysis plan for cross-study comparison, validation, and integration of cancer molecular classification studies using public data. We evaluated genes for cross-platform consistency of expression patterns, using integrative correlations, which quantify cross-study reproducibility without relying on direct assimilation of expression measurements across platforms. We then compared associations of gene expression levels to differential diagnosis of squamous cell carcinoma versus adenocarcinoma via reproducibility of the gene-specific t statistics and to survival via reproducibility of Cox coefficients.
Results: Integrative correlation analysis revealed a large proportion of genes in which the patterns agreed across studies more than would be expected by chance. Correlation of t statistics for diagnosis of squamous cell carcinoma versus adenocarcinoma is high (0.85) and increases (0.925) when using only the most consistent genes identified by integrative correlation. Correlations of Cox coefficients ranged from 0.13 to 0.31 (0.33–0.49 with genes selected for consistency). Although we find genes that are significant in multiple studies but show discordant effects, their number is approximately that expected by chance. We report genes that are reproducible by integrative analysis, significant in all studies, and concordant in effect.
Conclusions: Cross-study comparison revealed significant, albeit incomplete, agreement of gene expression patterns related to lung cancer biology and identified genes that reproducibly predict outcomes. This analysis approach is broadly applicable to cross-study comparisons of gene expression profiling projects.
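The abstract does not spell out how integrative correlations or the t-statistic comparison are computed, so the following is only a minimal sketch of one plausible reading, not the authors' implementation. It assumes that each study is a genes-by-samples NumPy array with genes matched and ordered identically across studies, that a gene's integrative correlation is the Pearson correlation between its within-study correlation profiles (its correlations with every other gene) in the two studies, and that the gene-specific t statistic is an unequal-variance (Welch) statistic. The function names `integrative_correlations` and `welch_t` are hypothetical.

```python
import numpy as np


def integrative_correlations(study_a, study_b):
    """Per-gene agreement of co-expression patterns between two studies.

    Assumption: study_a and study_b have shape (n_genes, n_samples), with
    rows referring to the same genes in the same order. Sample counts and
    measurement scales may differ, because only within-study correlations
    are ever compared.
    """
    corr_a = np.corrcoef(study_a)  # gene-by-gene correlations within study A
    corr_b = np.corrcoef(study_b)  # gene-by-gene correlations within study B
    n_genes = corr_a.shape[0]
    scores = np.empty(n_genes)
    for g in range(n_genes):
        others = np.arange(n_genes) != g  # drop the gene's self-correlation
        scores[g] = np.corrcoef(corr_a[g, others], corr_b[g, others])[0, 1]
    return scores


def welch_t(expr, is_group1):
    """Gene-wise two-sample t statistics (group 1 minus group 0).

    Assumption: the unequal-variance (Welch) form is used here for
    illustration; the abstract only says "gene-specific t statistics".
    """
    mask = np.asarray(is_group1, dtype=bool)
    x, y = expr[:, mask], expr[:, ~mask]
    diff = x.mean(axis=1) - y.mean(axis=1)
    se = np.sqrt(x.var(axis=1, ddof=1) / x.shape[1]
                 + y.var(axis=1, ddof=1) / y.shape[1])
    return diff / se


def t_statistic_agreement(expr_a, labels_a, expr_b, labels_b, keep=None):
    """Pearson correlation of gene-wise t statistics across two studies.

    keep: optional boolean mask selecting genes (for example, those with
    high integrative correlation) before the comparison.
    """
    t_a, t_b = welch_t(expr_a, labels_a), welch_t(expr_b, labels_b)
    if keep is not None:
        t_a, t_b = t_a[keep], t_b[keep]
    return np.corrcoef(t_a, t_b)[0, 1]
```

Under these assumptions, a consistency filter could be passed as `keep=integrative_correlations(a, b) > threshold`, where the threshold is a choice left to the analyst. Because each study contributes only its own correlation matrix, no rescaling of raw expression values across platforms is needed, which matches the abstract's description of avoiding direct assimilation of measurements.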