Long-Range Periodic Patterns in Microbial Genomes Indicate Significant Multi-Scale Chromosomal Organization

Abstract
Genome organization can be studied through analysis of chromosome position-dependent patterns in sequence-derived parameters. A comprehensive analysis of such patterns in prokaryotic sequences and genome-scale functional data has yet to be performed. We detected spatial patterns in sequence-derived parameters for 163 chromosomes occurring in 135 bacterial and 16 archaeal organisms using wavelet analysis. Pattern strength was found to correlate with organism-specific features such as genome size, overall GC content, and the occurrence of known motility and chromosomal binding proteins. Given additional functional data for Escherichia coli, we found significant correlations among chromosome position-dependent patterns in numerous properties, some of which are consistent with previously experimentally identified chromosome macrodomains. These results demonstrate that the large-scale organization of most sequenced genomes is significantly nonrandom, and, moreover, that this organization is likely linked to genome size, nucleotide composition, and information transfer processes. Constraints on genome evolution and design are thus not solely dependent upon information content, but also upon an intricate multi-parameter, multi-length-scale organization of the chromosome.
For more than a decade, the genetic material for a growing number of microbial organisms has been determined experimentally using genome sequencing techniques. These sequenced genomes provide researchers with an abundance of information regarding the composition and capabilities of each organism since they serve as “parts lists” that specify the protein machinery that each cell generates. However, genomes are not merely “lists” but also are typically arranged in nonrandom order. It is thought that this order may be related to some extent to the way in which each genome is packed into the tiny confines of a cell (often more than 1,000-fold packing).
The authors have used signal processing methods to identify long-range spatial patterns in the arrangement of most sequenced microbial genomes, and they have related the degree of organization in each genome to various characteristics specific to the corresponding organisms. They have also analyzed in detail the degree of overlap among patterns in numerous different kinds of data for a model bacterial organism, Escherichia coli. Their results conclusively demonstrate that there are significant evolutionary constraints that act upon genome organization as well as genome content, and that the interplay between organization and function cannot be ignored in understanding fundamentally how a microbial cell works.
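As an illustration of the kind of analysis summarized above, the following is a minimal sketch of how a position-dependent, sequence-derived parameter (here GC content in fixed windows) could be scanned for periodic patterns with a wavelet transform. It is not the authors' pipeline: the choice of a Morlet wavelet, the non-overlapping windows, the circular boundary treatment, and the function names `gc_content_profile`, `period_to_scale`, and `morlet_power` are all assumptions made for this example.

```python
import numpy as np

def gc_content_profile(seq, window):
    """Fraction of G/C bases in consecutive non-overlapping windows (assumed windowing)."""
    seq = seq.upper()
    n_windows = len(seq) // window
    is_gc = np.array([base in "GC" for base in seq[: n_windows * window]], dtype=float)
    return is_gc.reshape(n_windows, window).mean(axis=1)

def period_to_scale(period, w0=6.0):
    """Morlet scale whose equivalent Fourier period equals `period` (in samples)."""
    return np.asarray(period, dtype=float) * (w0 + np.sqrt(2.0 + w0**2)) / (4.0 * np.pi)

def morlet_power(signal, scales, w0=6.0):
    """Wavelet power |W(scale, position)|^2 using an analytic Morlet wavelet.

    The convolution is done with the FFT, so the signal is treated as
    periodic; this is an assumption that suits a circular chromosome.
    """
    x = np.asarray(signal, dtype=float)
    x = x - x.mean()                        # remove the mean so only variation is analyzed
    n = len(x)
    x_hat = np.fft.fft(x)
    omega = 2.0 * np.pi * np.fft.fftfreq(n)  # angular frequency, radians per sample
    power = np.empty((len(scales), n))
    for i, s in enumerate(scales):
        # Fourier transform of the scaled Morlet wavelet, positive frequencies only
        psi_hat = (np.pi ** -0.25) * np.sqrt(2.0 * np.pi * s) \
            * np.exp(-0.5 * (s * omega - w0) ** 2) * (omega > 0)
        coeffs = np.fft.ifft(x_hat * psi_hat)
        power[i] = np.abs(coeffs) ** 2
    return power

if __name__ == "__main__":
    # Synthetic profile: a 64-window periodic component plus noise (illustrative data only)
    rng = np.random.default_rng(0)
    position = np.arange(1024)
    profile = 0.5 + 0.05 * np.sin(2 * np.pi * position / 64) + 0.01 * rng.standard_normal(1024)
    periods = np.array([16, 32, 64, 128, 256])
    mean_power = morlet_power(profile, period_to_scale(periods)).mean(axis=1)
    print(dict(zip(periods.tolist(), mean_power.tolist())))
```

Averaging the power over position gives a simple per-scale measure of pattern strength. A real analysis would additionally need a significance test against a suitable null model, which this sketch omits.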