A robust neural networks approach for spatial and intensity-dependent normalization of cDNA microarray data
Open Access
- 29 March 2005
- journal article
- research article
- Published by Oxford University Press (OUP) in Bioinformatics
- Vol. 21 (11) , 2674-2683
- https://doi.org/10.1093/bioinformatics/bti397
Abstract
Motivation: Microarray experiments are affected by numerous sources of non-biological variation that contribute systematic bias to the resulting data. In a dual-label (two-color) cDNA or long-oligonucleotide microarray, these systematic biases often appear as an imbalance between the measured fluorescent intensities for Sample A and those for Sample B. Systematic biases also affect between-slide comparisons. Correcting these biases effectively is a prerequisite for detecting the underlying biological variation between samples, so data normalization is an essential step in the confident identification of biologically relevant differences in gene expression profiles. Several normalization methods for the correction of systematic bias have been described. While many of these methods address intensity-dependent bias, few address both intensity-dependent and spatiality-dependent bias.
Results: We present a neural network-based normalization method for correcting the intensity- and spatiality-dependent bias in cDNA microarray datasets. In this method, the dependence of the log-intensity ratio (M) on the average log-intensity (A) and on the spatial coordinates (X,Y) of the spots is approximated with a feed-forward neural network function. Resistance to outliers is provided in two ways: each spot is assigned a weight based on how distant its M value is from the median M over the spots with similar A values, and pseudospatial coordinates are used instead of spot row and column indices. A comparison of the robust neural network method with other published methods demonstrates its potential for reducing both intensity-dependent and spatiality-dependent bias, which translates into more reliable identification of truly regulated genes.
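The idea described in the Results can be illustrated with a minimal NumPy sketch. It is not the nnNorm implementation. All function names (`ma_values`, `pseudo_coords`, `robust_weights`, `scale_inputs`, `fit_network`, `normalize`) are hypothetical, and several details are assumptions not stated in the abstract: the bisquare weighting scaled by the median absolute deviation, the equal-count A bins, the 3 x 3 blocks used as pseudospatial coordinates, the three hidden units, and training by plain gradient descent.

```python
import numpy as np

def ma_values(red, green):
    """Log-intensity ratio M and average log-intensity A for each spot."""
    log_r, log_g = np.log2(red), np.log2(green)
    return log_r - log_g, 0.5 * (log_r + log_g)

def pseudo_coords(rows, cols, block=3):
    """Assumed pseudospatial coordinates: index of the block x block cell
    that contains each spot, instead of its exact row and column."""
    return np.asarray(rows) // block, np.asarray(cols) // block

def robust_weights(M, A, n_bins=10, c=4.685):
    """Assumed weighting: bisquare weights from the MAD-scaled distance of
    each M to the median M of the spots in the same equal-count A bin."""
    order = np.argsort(A)
    bins = np.empty(len(A), dtype=int)
    bins[order] = np.arange(len(A)) * n_bins // len(A)
    w = np.ones(len(M))
    for b in range(n_bins):
        idx = bins == b
        if not idx.any():
            continue
        dev = M[idx] - np.median(M[idx])
        scale = 1.4826 * np.median(np.abs(dev)) + 1e-12
        u = np.clip(np.abs(dev) / (c * scale), 0.0, 1.0)
        w[idx] = (1.0 - u ** 2) ** 2
    return w

def scale_inputs(*columns):
    """Stack the columns and rescale each one to the range [-1, 1]."""
    X = np.column_stack(columns).astype(float)
    lo, hi = X.min(axis=0), X.max(axis=0)
    return 2.0 * (X - lo) / np.where(hi > lo, hi - lo, 1.0) - 1.0

def fit_network(X, target, weights, hidden=3, epochs=5000, lr=0.1, seed=0):
    """Fit a one-hidden-layer tanh network by gradient descent on the
    weighted squared error and return its fitted values at X."""
    rng = np.random.default_rng(seed)
    W1 = rng.normal(scale=0.5, size=(X.shape[1], hidden))
    b1 = np.zeros(hidden)
    W2 = rng.normal(scale=0.5, size=hidden)
    b2 = 0.0
    wn = weights / weights.sum()
    for _ in range(epochs):
        H = np.tanh(X @ W1 + b1)               # hidden activations
        g = 2.0 * wn * (H @ W2 + b2 - target)  # d(loss)/d(prediction)
        gZ = np.outer(g, W2) * (1.0 - H ** 2)  # back-propagated to hidden layer
        W2 -= lr * (H.T @ g)
        b2 -= lr * g.sum()
        W1 -= lr * (X.T @ gZ)
        b1 -= lr * gZ.sum(axis=0)
    return np.tanh(X @ W1 + b1) @ W2 + b2

def normalize(red, green, rows, cols):
    """Subtract the fitted bias, a function of (A, X, Y), from M."""
    M, A = ma_values(red, green)
    px, py = pseudo_coords(rows, cols)
    bias = fit_network(scale_inputs(A, px, py), M, robust_weights(M, A))
    return M - bias, A
```

Calling `normalize(red, green, rows, cols)` with per-spot intensity arrays and grid indices returns the bias-corrected M values together with A. Spots whose M lies far from the local median receive weights near zero, so strongly regulated genes contribute little to the fitted bias surface.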
Availability: The normalization method described in this paper is available as the library nnNorm in the BioConductor project (http://www.bioconductor.org). Scripts used to load the freely available data and generate some of the figures in this paper are available in the documentation accompanying this library.
Contact: ltarca@rsvs.ulaval.ca