Power enhancement via multivariate outlier testing with gene expression arrays
Open Access
- 16 November 2008
- journal article
- research article
- Published by Oxford University Press (OUP) in Bioinformatics
- Vol. 25 (1) , 48-53
- https://doi.org/10.1093/bioinformatics/btn591
Abstract
Motivation: As the use of microarrays in human studies continues to increase, stringent quality assurance is necessary to ensure accurate experimental interpretation. We present a formal approach for microarray quality assessment that is based on dimension reduction of established measures of signal and noise components of expression followed by parametric multivariate outlier testing.
Results: We applied our approach to several data resources. First, as a negative control, we found that the Affymetrix and Illumina contributions to MAQC data were free from outliers at a nominal outlier flagging rate of α=0.01. Second, we created a tunable framework for artificially corrupting intensity data from the Affymetrix Latin Square spike-in experiment to allow investigation of sensitivity and specificity of quality assurance (QA) criteria. Third, we applied the procedure to 507 Affymetrix microarray GeneChips processed with RNA from human peripheral blood samples. We show that exclusion of arrays by this approach substantially increases inferential power, or the ability to detect differential expression, in large clinical studies.
Availability: http://bioconductor.org/packages/2.3/bioc/html/arrayMvout.html and http://bioconductor.org/packages/2.3/bioc/html/affyContam.html affyContam (credentials: readonly/readonly)
Contact: aasare@immunetolerance.org; stvjc@channing.harvard.edu
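The outlier-testing step described in the abstract can be illustrated with a minimal sketch. It is not the arrayMvout implementation: it assumes the QA measures have already been reduced to two component scores per array, and it swaps in a plain squared Mahalanobis distance compared against a chi-square cutoff at α=0.01, which is only an approximation when the mean and covariance are estimated from the same data. The function names and the simulated scores are hypothetical.

```python
import math
import random


def mahalanobis_sq_2d(points):
    """Squared Mahalanobis distance of each 2-D point from the sample mean."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    # Unbiased sample covariance entries.
    sxx = sum((p[0] - mx) ** 2 for p in points) / (n - 1)
    syy = sum((p[1] - my) ** 2 for p in points) / (n - 1)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points) / (n - 1)
    det = sxx * syy - sxy ** 2
    if det <= 0:
        raise ValueError("sample covariance matrix is singular")
    distances = []
    for x, y in points:
        dx, dy = x - mx, y - my
        # d^2 = v^T S^{-1} v, with the 2x2 inverse written out explicitly.
        distances.append((syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det)
    return distances


def flag_outliers(points, alpha=0.01):
    """Indices of points whose squared distance exceeds the chi-square cutoff."""
    # With 2 degrees of freedom the chi-square survival function is exp(-x / 2),
    # so the upper-alpha quantile has the closed form -2 * ln(alpha).
    cutoff = -2.0 * math.log(alpha)
    return [i for i, d in enumerate(mahalanobis_sq_2d(points)) if d > cutoff]


# Hypothetical data: 60 well-behaved arrays plus one simulated corrupted array.
rng = random.Random(0)
scores = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(60)]
scores.append((8.0, -7.0))  # assumed corrupted array, stored at index 60
flagged = flag_outliers(scores)
print(flagged)
```

Because the mean and covariance here are computed with the suspect array included, a gross outlier inflates the covariance and can mask milder ones; sequential procedures such as those in the reference list below address this by testing and removing candidates one at a time.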
This publication has 13 references indexed in Scilit:
- lumi: a pipeline for processing Illumina microarray. Bioinformatics, 2008
- MDQC: a new quality assessment method for microarrays based on quality control reports. Bioinformatics, 2007
- The MicroArray Quality Control (MAQC) project shows inter- and intraplatform reproducibility of gene expression measurements. Nature Biotechnology, 2006
- Power and sample size estimation in high dimensional biology. Statistical Methods in Medical Research, 2004
- Expression profiling — best practices for data generation and interpretation in clinical trials. Nature Reviews Genetics, 2004
- affy—analysis of Affymetrix GeneChip data at the probe level. Bioinformatics, 2004
- Summaries of Affymetrix GeneChip probe level data. Nucleic Acids Research, 2003
- Sequential Application of Wilks's Multivariate Outlier Test. Journal of the Royal Statistical Society Series C: Applied Statistics, 1992
- Percentage Points for a Generalized ESD Many-Outlier Procedure. Technometrics, 1983