CG dinucleotide clusters in MHC genes and in 5′ demethylated genes

Abstract
In the DNA of higher vertebrates, the dinucleotide CG is unique in two respects: it occurs far less frequently than would be expected from the cytosine and guanine content of a given DNA segment ("CG suppression"), and it contains predominantly 5-methylcytosine, the only modified nucleotide common in vertebrate DNA. Here we point out the existence of CG clusters, i.e., localized lapses in the usual CG suppression, in two categories of vertebrate DNA segments: around the polymorphic exons of major histocompatibility complex (MHC) genes and in the 5′ regions of certain other genes. These observations contradict the recent suggestion that CG frequency is uniform over long contiguous segments of DNA containing several genes. A model in which these CG clusters originate from regional demethylation of germline DNA is supported by analysis of other sequence features of these regions and by previously published data on the methylation status, in sperm DNA, of two of these CG-rich regions.
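
As an illustration only, CG suppression can be expressed as the ratio of the observed CG count to the count expected from the cytosine and guanine content of a segment; values well below 1 indicate suppression, and a local rise toward 1 or above would mark a candidate CG cluster. The sketch below is not the procedure used in this paper. The function names (`cg_obs_exp`, `sliding_cg_ratio`), the window and step sizes, and the expected-count convention (C count × G count / length) are all assumptions chosen for the example.

```python
def cg_obs_exp(seq: str) -> float:
    """Observed/expected CG ratio for one DNA segment (hypothetical helper).

    Expected count is taken as (#C * #G) / length, a common convention
    assumed here, not one stated in the abstract.
    """
    s = seq.upper()
    n = len(s)
    c = s.count("C")
    g = s.count("G")
    if n < 2 or c == 0 or g == 0:
        return 0.0  # ratio is undefined without both bases; report 0.0
    # "CG" cannot overlap itself, so the non-overlapping count is exact.
    observed = s.count("CG")
    expected = c * g / n
    return observed / expected


def sliding_cg_ratio(seq: str, window: int = 200, step: int = 50):
    """Return (start, ratio) pairs for windows along seq.

    Window and step defaults are arbitrary illustrative values.
    """
    last_start = max(len(seq) - window, 0)
    return [
        (start, cg_obs_exp(seq[start:start + window]))
        for start in range(0, last_start + 1, step)
    ]


if __name__ == "__main__":
    # Toy sequence: a CG-rich block flanked by CG-free AT repeats.
    toy = "AT" * 10 + "CG" * 10 + "AT" * 10
    for start, ratio in sliding_cg_ratio(toy, window=20, step=20):
        print(start, round(ratio, 2))
```

Scanning a sequence in windows like this reports a ratio for each position along the segment, so a localized lapse in suppression shows up as a run of windows with elevated ratios against a low background.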