Shotgun bisulphite sequencing of the Arabidopsis genome reveals DNA methylation patterning

Abstract
A newly developed method for characterizing an organism's 'methylome', that is, the pattern of DNA methylation across the genome, has been used to generate a map of methylated cytosines in Arabidopsis at single-base-pair resolution. The procedure, termed BS-Seq, combines bisulphite treatment of genomic DNA with ultra-high-throughput DNA sequencing to give a more precise and comprehensive result than was previously possible. DNA methylation is an important factor in regulating gene expression, and this method, which can be applied to larger genomes such as that of the mouse as well as to Arabidopsis, could prove a significant advance in the study of this form of gene regulation.

Cytosine DNA methylation is important in regulating gene expression and in silencing transposons and other repetitive sequences [1,2]. Recent genomic studies in Arabidopsis thaliana have revealed that many endogenous genes are methylated either within their promoters or within their transcribed regions, and that gene methylation is highly correlated with transcription levels [3,4,5]. However, plants have different types of methylation controlled by different genetic pathways, and detailed information on the methylation status of each cytosine in any given genome is lacking. To this end, we generated a map of methylated cytosines for Arabidopsis at single-base-pair resolution by combining bisulphite treatment of genomic DNA with ultra-high-throughput sequencing using the Illumina 1G Genome Analyser and Solexa sequencing technology [6]. Unlike previous microarray-based methods, this approach, termed BS-Seq, allows cytosine methylation to be measured sensitively on a genome-wide scale within specific sequence contexts. Here we describe methylation on previously inaccessible components of the genome and analyse the sequence composition and distribution of DNA methylation. We also describe the effect of various DNA methylation mutants on genome-wide methylation patterns, and demonstrate that our newly developed library construction and computational methods can be applied to large genomes such as that of the mouse.
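The principle behind reading methylation from bisulphite-treated DNA can be illustrated with a small sketch. Bisulphite treatment converts unmethylated cytosines to uracil, which is read as thymine after sequencing, whereas methylated cytosines remain cytosine; comparing aligned reads with the reference therefore gives a per-cytosine methylation call. The code below is not the pipeline used in this study. The function names, the toy sequence, the restriction to the forward strand, and the assumption of ungapped alignments with known start positions are all illustrative. The CG, CHG and CHH context labels (H = A, C or T) follow the convention commonly used in plant methylation studies and are assumed here rather than taken from the abstract.

```python
def classify_context(genome, pos):
    """Return 'CG', 'CHG' or 'CHH' for a forward-strand cytosine, else None."""
    if genome[pos] != "C":
        return None
    nxt = genome[pos + 1 : pos + 3]  # up to two downstream bases
    if nxt[:1] == "G":
        return "CG"
    if len(nxt) == 2 and nxt[0] in "ACT":
        if nxt[1] == "G":
            return "CHG"
        if nxt[1] in "ACT":
            return "CHH"
    return None  # context undetermined (sequence end or ambiguous base)


def bisulfite_convert(seq, methylated_positions):
    """Simulate conversion: unmethylated C becomes T, methylated C is kept."""
    return "".join(
        "T" if base == "C" and i not in methylated_positions else base
        for i, base in enumerate(seq)
    )


def call_methylation(genome, reads):
    """Count methylated (C) and total (C or T) read bases at each reference C.

    reads: iterable of (start, sequence) pairs, assumed to be ungapped
    forward-strand alignments. Returns {position: (context, methylated, total)}.
    """
    counts = {}
    for start, seq in reads:
        for offset, base in enumerate(seq):
            pos = start + offset
            if pos >= len(genome) or genome[pos] != "C" or base not in "CT":
                continue
            context = classify_context(genome, pos)
            if context is None:
                continue
            _, meth, total = counts.get(pos, (context, 0, 0))
            counts[pos] = (context, meth + (base == "C"), total + 1)
    return counts


# Toy example: cytosines at 1 (CG), 4 (CHG), 8 (CHH) and 12 (CG).
genome = "ACGTCAGTCTTACG"
read = bisulfite_convert(genome, methylated_positions={1, 4})
calls = call_methylation(genome, [(0, read)])
```

With real data, each cytosine would be covered by many reads, and the ratio of methylated to total counts would estimate the fraction of molecules methylated at that site. Reads from the reverse strand, where conversion appears as G to A, would need the complementary treatment.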