Comparative sequence analysis of the MECP2 locus in human and mouse reveals new transcribed regions

Abstract
Comparative sequence analysis facilitates the identification of evolutionarily conserved regions, such as gene-regulatory elements, that cannot be detected by analyzing a single species. Sequencing of a 152-kb region on human Chromosome (Chr) Xq28 and of the syntenic 123-kb region on mouse Chr XC identified the MECP2/Mecp2 locus, which is flanked by the gene coding for Interleukin-1 receptor associated kinase (IRAK/Il1rak) and the red opsin gene (RCP/Rsvp). By comparative sequence analysis, we identified a previously unknown, non-coding 5′ exon embedded in a CpG island associated with MECP2/Mecp2. Thus, the MECP2/Mecp2 gene comprises four exons instead of three. Furthermore, sequence comparison 3′ to the previously reported polyadenylation signal revealed a highly conserved region of 8.5 kb terminating in an alternative polyadenylation signal. Northern blot analysis verified the existence of two main transcripts of 1.9 kb and ∼10 kb, respectively. Both transcripts exhibit tissue-specific expression patterns and have almost identical, short half-lives. The ∼10-kb transcript contains a giant 3′ UTR encoded in the fourth exon of MECP2. The long 3′ UTR and the newly identified first intron of MECP2/Mecp2 are highly conserved between human and mouse. Furthermore, the human MECP2 locus is heterogeneous with respect to its DNA composition. We postulate that it represents a previously unobserved boundary between two H3 isochores.
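
The abstract refers to a CpG island and to heterogeneous DNA composition without stating how either was assessed. As a rough illustration only, the sketch below computes the two quantities commonly used for such calls: GC fraction and the observed/expected CpG ratio. The function names and the thresholds (length of at least 200 bp, GC of at least 50%, observed/expected CpG of at least 0.6, the widely used Gardiner-Garden and Frommer criteria) are assumptions made for this example and are not taken from the study.

```python
def cpg_stats(seq):
    """Return (GC fraction, observed/expected CpG ratio) for a DNA string.

    Hypothetical helper for illustration; not the authors' method.
    """
    s = seq.upper()
    n = len(s)
    if n == 0:
        return 0.0, 0.0
    c = s.count("C")
    g = s.count("G")
    cpg = s.count("CG")            # observed CpG dinucleotides
    gc_fraction = (c + g) / n
    expected = c * g / n           # expected CpG count given base composition
    obs_exp = cpg / expected if expected > 0 else 0.0
    return gc_fraction, obs_exp


def looks_like_cpg_island(seq, min_len=200, min_gc=0.5, min_obs_exp=0.6):
    """Apply the assumed length, GC, and obs/exp CpG thresholds to one window."""
    if len(seq) < min_len:
        return False
    gc_fraction, obs_exp = cpg_stats(seq)
    return gc_fraction >= min_gc and obs_exp >= min_obs_exp
```

Applying `cpg_stats` to consecutive windows along a sequence would also give a simple GC profile of the kind used to describe compositional heterogeneity, although the window size would be a further assumption.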