Complete Pipeline for Infinium® Human Methylation 450K BeadChip Data Processing Using Subset Quantile Normalization for Accurate DNA Methylation Estimation
- 12 June 2012
- journal article
- research article
- Published by Taylor & Francis in Epigenomics
- Vol. 4 (3) , 325-341
- https://doi.org/10.2217/epi.12.21
Abstract
Background: Huge progress has been made in the development of array- or sequencing-based technologies for DNA methylation analysis. The Illumina Infinium® Human Methylation 450K BeadChip (Illumina Inc., CA, USA) allows the simultaneous quantitative monitoring of more than 480,000 CpG positions, enabling large-scale epigenotyping studies. However, the assay combines two different assay chemistries, which may cause a bias in the analysis if all signals are merged as a unique source of methylation measurement.
Materials & methods: We confirm in three 450K data sets that Infinium I signals are more stable and cover a wider dynamic range of methylation values than Infinium II signals. We evaluated the methylation profile of Infinium I and II probes obtained with different normalization protocols and compared these results with the methylation values of a subset of CpGs analyzed by pyrosequencing.
Results: We developed a subset quantile normalization approach for the processing of 450K BeadChips. The Infinium I signals were used as 'anchors' to normalize Infinium II signals at the level of probe coverage categories. Our normalization approach outperformed alternative normalization or correction approaches in terms of bias correction and methylation signal estimation. We further implemented a complete preprocessing protocol that solves most of the issues currently raised by 450K array users.
Conclusion: We developed a complete preprocessing pipeline for 450K BeadChip data using an original subset quantile normalization approach that performs both sample normalization and efficient Infinium I/II shift correction. The scripts, being freely available from the authors, will allow researchers to concentrate on the biological analysis of data, such as the identification of DNA methylation signatures.
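The anchoring idea described in the Results can be illustrated with a minimal sketch. This is not the authors' implementation (their scripts are distributed separately); it only shows the general technique of mapping one set of values onto the distribution of a reference subset, category by category. All function names, the tuple layout of `probes`, and the category labels are hypothetical, and the abstract does not specify which quantity (raw signal or methylation value) is normalized, so the sketch operates on generic numbers.

```python
from collections import defaultdict


def quantile(sorted_values, q):
    """Linearly interpolated quantile of an ascending list, with q in [0, 1]."""
    if not sorted_values:
        raise ValueError("empty reference distribution")
    position = q * (len(sorted_values) - 1)
    lower = int(position)
    upper = min(lower + 1, len(sorted_values) - 1)
    fraction = position - lower
    return sorted_values[lower] * (1 - fraction) + sorted_values[upper] * fraction


def normalize_to_anchors(anchor_values, target_values):
    """Replace each target value by the anchor quantile matching its rank.

    The rank order of the targets is preserved; tied targets receive
    neighbouring (not identical) quantiles in this simplified version.
    """
    reference = sorted(anchor_values)
    n = len(target_values)
    order = sorted(range(n), key=lambda i: target_values[i])
    normalized = [0.0] * n
    for rank, index in enumerate(order):
        q = rank / (n - 1) if n > 1 else 0.5
        normalized[index] = quantile(reference, q)
    return normalized


def subset_quantile_normalize(probes):
    """Normalize Infinium II values against Infinium I anchors per category.

    probes: list of (probe_id, design, category, value), design "I" or "II"
            (hypothetical layout chosen for this sketch).
    Returns {probe_id: value}. Infinium I values are returned unchanged, as
    are Infinium II values in a category that has no anchors.
    """
    anchors = defaultdict(list)
    targets = defaultdict(list)
    for probe_id, design, category, value in probes:
        group = anchors if design == "I" else targets
        group[category].append((probe_id, value))

    result = {}
    for items in anchors.values():
        for probe_id, value in items:
            result[probe_id] = value
    for category, items in targets.items():
        values = [value for _, value in items]
        if category in anchors:
            values = normalize_to_anchors(
                [value for _, value in anchors[category]], values
            )
        for (probe_id, _), value in zip(items, values):
            result[probe_id] = value
    return result
```

In this sketch the Infinium II values in each category inherit the range and shape of the Infinium I distribution from the same category, which is the sense in which the Infinium I probes act as anchors. A production pipeline would also need to handle ties, missing values, and sample-to-sample normalization, none of which is attempted here.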
Funding Information
- EU’s Seventh Framework Program (FP7/2007-2013)