A Theoretical Limit to Coding Space in Chromosomes of Bacteria

Abstract
A mathematical model of cluster patterns for mapped genes with known phenotypes in Escherichia coli predicted that functional genes may account for at most two-thirds of the total chromosomal space. The corollary prediction was that the remaining one-third of the chromosome is noncoding space. Open reading frame (ORF) analyses of 15 phylogenetically diverse bacterial genomes and of 30 fully sequenced prokaryotic genomes supported the gene cluster model's prediction that coding space tends toward two-thirds of the chromosome. Our results suggest that only 3-4% of unassigned ORFs in E. coli represent genes with a potential phenotype and that ORFs marking novel genes in prokaryotes are far fewer than previously thought.