Filtering genes to improve sensitivity in oligonucleotide microarray data analysis
Open Access
- 15 August 2007
- journal article
- research article
- Published by Oxford University Press (OUP) in Nucleic Acids Research
- Vol. 35 (16) , e102
- https://doi.org/10.1093/nar/gkm537
Abstract
Many recent microarrays hold an enormous number of probe sets, thus raising many practical and theoretical problems in controlling the false discovery rate (FDR). Biologically, it is likely that most probe sets are associated with un-expressed genes, so the measured values are simply noise due to non-specific binding; also many probe sets are associated with non-differentially-expressed (non-DE) genes. In an analysis to find DE genes, these probe sets contribute to the false discoveries, so it is desirable to filter out these probe sets prior to analysis. In the methodology proposed here, we first fit a robust linear model for probe-level Affymetrix data that accounts for probe and array effects. We then develop a novel procedure called FLUSH (Filtering Likely Uninformative Sets of Hybridizations), which excludes probe sets that have statistically small array-effects or large residual variance. This filtering procedure was evaluated on a publicly available data set from a controlled spiked-in experiment, as well as on a real experimental data set of a mouse model for retinal degeneration. In both cases, FLUSH filtering improves the sensitivity in the detection of DE genes compared to analyses using unfiltered, presence-filtered, intensity-filtered and variance-filtered data. A freely-available package called FLUSH implements the procedures and graphical displays described in the article.
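The filtering idea in the abstract can be illustrated with a minimal sketch. It is not the FLUSH package or the authors' exact model: it assumes the log-scale intensities of one probe set are arranged as a probes-by-arrays matrix, it substitutes Tukey's median polish as a stand-in robust fit of an additive probe-plus-array model, and it applies two fixed cut-offs. The function names and the thresholds `min_array_ss` and `max_resid_var` are hypothetical and chosen only for illustration.

```python
import numpy as np


def median_polish(y, n_iter=10):
    """Robustly fit y[i, j] ~ overall + probe[i] + array[j] (Tukey's median polish).

    Stand-in for the robust linear model mentioned in the abstract (assumption).
    """
    resid = np.asarray(y, dtype=float).copy()
    overall = 0.0
    probe = np.zeros(resid.shape[0])   # one effect per probe (row)
    array = np.zeros(resid.shape[1])   # one effect per array (column)
    for _ in range(n_iter):
        # Sweep row medians into the probe effects.
        row_med = np.median(resid, axis=1)
        probe += row_med
        resid -= row_med[:, None]
        shift = np.median(array)
        array -= shift
        overall += shift
        # Sweep column medians into the array effects.
        col_med = np.median(resid, axis=0)
        array += col_med
        resid -= col_med[None, :]
        shift = np.median(probe)
        probe -= shift
        overall += shift
    return overall, probe, array, resid


def summarize_probe_set(y):
    """Return (array-effect sum of squares, residual variance) for one probe set."""
    y = np.asarray(y, dtype=float)
    _, _, array, resid = median_polish(y)
    array_ss = float(np.sum((array - array.mean()) ** 2))
    dof = (y.shape[0] - 1) * (y.shape[1] - 1)
    resid_var = float(np.sum(resid ** 2) / dof)
    return array_ss, resid_var


def keep_probe_set(y, min_array_ss, max_resid_var):
    """Hypothetical filter: drop probe sets with small array effects
    or large residual variance; the thresholds are illustrative only."""
    array_ss, resid_var = summarize_probe_set(y)
    return array_ss >= min_array_ss and resid_var <= max_resid_var


# Usage: five probes measured on six arrays (synthetic values).
probe_effects = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
array_effects = np.array([0.0, 0.0, 0.0, 2.0, 2.0, 2.0])
informative = probe_effects[:, None] + array_effects[None, :]
flat = probe_effects[:, None] + np.zeros((1, 6))
print(keep_probe_set(informative, min_array_ss=1.0, max_resid_var=0.5))
print(keep_probe_set(flat, min_array_ss=1.0, max_resid_var=0.5))
```

In this sketch a probe set whose fitted array effects barely vary across arrays, or whose residuals are large relative to the additive fit, is treated as uninformative and removed before any test for differential expression. How the cut-offs are chosen in practice is left open here.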