Large scale data mining approach for gene-specific standardization of microarray gene expression data
Open Access
- 10 October 2006
- journal article
- research article
- Published by Oxford University Press (OUP) in Bioinformatics
- Vol. 22 (23) , 2898-2904
- https://doi.org/10.1093/bioinformatics/btl500
Abstract
Motivation: Identifying changes in gene expression in multifactorial diseases, such as breast cancer, is a major goal of DNA microarray experiments. Here we present a new data mining strategy to better analyze marginal differences in gene expression between microarray samples. The idea is based on the notion that considering a gene's behavior across a wide variety of experiments can improve the statistical reliability of identifying genes with moderate changes between samples.
Results: The availability of a large collection of array samples sharing the same platform in public databases, such as NCBI GEO, enabled us to re-standardize the expression intensity of a gene using its mean and variation across a wide variety of experimental conditions. This approach was evaluated via the re-identification of breast cancer-specific gene expression. It successfully prioritized several genes associated with breast tumor for which the expression difference between normal and breast cancer cells was marginal and thus would have been difficult to recognize using conventional analysis methods. By maximizing the utility of microarray data in public databases, it provides a valuable tool, particularly for the identification of previously unrecognized disease-related genes.
Availability: A user-friendly web interface ( ) was constructed to provide the present large-scale approach for the analysis of GEO microarray data (GS-LAGE server).
Contact: yoonsj@sookmyung.ac.kr
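The gene-specific re-standardization described under Results can be illustrated with a minimal sketch. The abstract does not give the exact formula, so the code below assumes a plain z-score: each gene's intensity is centered and scaled by that gene's own mean and standard deviation across a reference collection of arrays from the same platform. The function names (`gene_specific_standardize`, `prioritize`), the gene labels, and all intensity values are hypothetical and are not taken from the paper or the GS-LAGE server.

```python
from statistics import mean, stdev


def gene_specific_standardize(reference, sample):
    """Re-standardize one array using per-gene statistics from many arrays.

    reference: dict mapping gene -> list of intensities across a large
               collection of arrays on the same platform (assumed input).
    sample:    dict mapping gene -> intensity in the array of interest.
    Returns a dict mapping gene -> z-score. Genes with fewer than two
    reference values or zero variation are skipped.
    """
    z_scores = {}
    for gene, value in sample.items():
        values = reference.get(gene)
        if not values or len(values) < 2:
            continue
        sd = stdev(values)
        if sd == 0:
            continue
        z_scores[gene] = (value - mean(values)) / sd
    return z_scores


def prioritize(reference, normal, tumor):
    """Rank genes by the absolute difference of their standardized values."""
    z_normal = gene_specific_standardize(reference, normal)
    z_tumor = gene_specific_standardize(reference, tumor)
    shared = z_normal.keys() & z_tumor.keys()
    diffs = {gene: z_tumor[gene] - z_normal[gene] for gene in shared}
    return sorted(diffs.items(), key=lambda item: abs(item[1]), reverse=True)


# Hypothetical data: GENE_A barely varies across the reference collection,
# GENE_B varies widely.
reference = {
    "GENE_A": [7.9, 8.0, 8.1, 8.0, 8.0],
    "GENE_B": [4.0, 8.0, 12.0, 6.0, 10.0],
}
normal = {"GENE_A": 8.0, "GENE_B": 8.0}
tumor = {"GENE_A": 8.3, "GENE_B": 9.0}

ranked = prioritize(reference, normal, tumor)
for gene, diff in ranked:
    print(gene, round(diff, 2))
```

In this toy data the raw tumor-versus-normal difference is smaller for GENE_A (0.3) than for GENE_B (1.0), yet GENE_A ranks first after standardization because it is nearly constant across the reference collection. This is the kind of marginal but consistent change the abstract says conventional analysis would tend to miss.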