A functional survey of the enhancer activity of conserved non-coding sequences from vertebrate Iroquois cluster gene deserts

Abstract
Recent studies of the genome architecture of vertebrates have uncovered two unforeseen aspects of its organization. First, large regions of the genome, called gene deserts, are devoid of protein-coding sequences and have no obvious biological role. Second, comparative genomics has highlighted the existence of an array of highly conserved non-coding regions (HCNRs) in all vertebrates. Most surprisingly, these structural features are strongly associated with genes that have essential functions during development. Among these, the vertebrate Iroquois (Irx) genes stand out on both fronts. Mammalian Irx genes are organized in two clusters (IrxA and IrxB) that span >1 Mb each with no other genes interspersed. Additionally, a large number of HCNRs exist within Irx clusters. We have systematically examined the enhancer activity of HCNRs from the IrxB cluster using transgenic Xenopus and zebrafish embryos. Most of these HCNRs are active in subdomains of endogenous Irx expression, and some are candidates to contain shared enhancers of neighboring genes, which could explain the evolutionary conservation of Irx clusters. Furthermore, HCNRs present in tetrapod IrxB but not in fish may be responsible for novel Irx expression domains that appeared after their divergence. Finally, we have performed a more detailed analysis on two IrxB ultraconserved non-coding regions (UCRs) duplicated in IrxA clusters in similar relative positions. These four regions share a core region highly conserved among all of them and drive expression in similar domains. However, inter-species conserved sequences surrounding the core, specific for each of these UCRs, are able to modulate their expression.