A re-annotation pipeline for Illumina BeadArrays: improving the interpretation of gene expression data
Open Access
- 18 November 2009
- journal article
- research article
- Published by Oxford University Press (OUP) in Nucleic Acids Research
- Vol. 38 (3) , e17
- https://doi.org/10.1093/nar/gkp942
Abstract
Illumina BeadArrays are among the most popular and reliable platforms for gene expression profiling. However, little external scrutiny has been given to the design, selection and annotation of BeadArray probes, which is a fundamental issue in data quality and interpretation. Here we present a pipeline for the complete genomic and transcriptomic re-annotation of Illumina probe sequences, also applicable to other platforms, with its output available through a Web interface and incorporated into Bioconductor packages. We have identified several problems with the design of individual probes and we show the benefits of probe re-annotation on the analysis of BeadArray gene expression data sets. We discuss the importance of aspects such as probe coverage of individual transcripts, alternative messenger RNA splicing, single-nucleotide polymorphisms, repeat sequences, RNA degradation biases and probes targeting genomic regions with no known transcription. We conclude that many of the Illumina probes have unreliable original annotation and that our re-annotation allows analyses to focus on the good quality probes, which form the majority, and also to expand the scope of biological information that can be extracted.