Evidence of a Large Novel Gene Pool Associated with Prokaryotic Genomic Islands

Abstract
Microbial genes that are “novel” (no detectable homologs in other species) are of increasing interest, as environmental sampling suggests that many more such genes exist in yet-to-be-cultured microorganisms. By analyzing known microbial genomic islands and prophages, we developed criteria for the systematic identification of putative genomic islands (clusters of genes of probable horizontal origin in a prokaryotic genome) in 63 prokaryotic genomes, and then characterized the distribution of novel genes and other features. All but a few of the genomes examined contained significantly higher proportions of novel genes in their predicted genomic islands than in the rest of the genome (paired t-test, p = 4.43E-14 to 1.27E-18, depending on method). Moreover, the reverse observation (i.e., higher proportions of novel genes outside of islands) never reached statistical significance in any organism examined. We show that this higher proportion of novel genes in predicted genomic islands is not due to less accurate gene prediction in genomic island regions, but likely reflects a genuine increase in novel genes in these regions for both bacteria and archaea. This represents the first comprehensive analysis of novel genes in prokaryotic genomic islands and provides clues regarding the origin of novel genes. Our collective results imply that there are different gene pools associated with recently horizontally transmitted genomic regions versus regions that are primarily vertically inherited, and that the gene pool associated with genomic islands contains more novel genes. Since genomic islands are frequently associated with a particular microbial adaptation, such as antibiotic resistance, pathogen virulence, or metal resistance, this suggests that microbes may have access to a larger “arsenal” of novel genes for adaptation than previously thought.
More than 250 microbial genomes have been sequenced to date. A significant proportion of the genes in these genomes have no apparent similarity to known genes, and their functions are unknown (i.e., they appear to be novel). As the number of sequenced genomes increases, the number of these novel genes continues to increase. In this paper, the authors show, through an analysis of a diverse range of prokaryotic genomes, that novel genes are more prevalent in regions called genomic islands. Genomic islands are clusters of genes in genomes that show evidence of horizontal origins. This study is notable because genomic islands disproportionately contain genes of medical, agricultural, and environmental importance (e.g., animal and plant pathogen virulence factors, antibiotic resistance genes, and phenolic degradation genes). The observation that high proportions of novel genes are also localized to genomic islands suggests that microbes may have access to a larger “arsenal” of novel genes for important adaptations than previously thought. These results also imply that there are different gene pools associated with recently horizontally transmitted genomic regions versus regions that are primarily vertically inherited. The authors suggest that further studies involving large-scale environmental genomic sampling are required to help characterize this understudied gene pool.
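The paired comparison reported in the abstract (proportion of novel genes inside predicted islands versus the rest of the same genome, one pair per genome) can be illustrated with a minimal sketch. This is not the authors' analysis code: the function name `paired_t_statistic` and the four pairs of proportions are hypothetical and chosen only for illustration, and the sketch computes the t statistic and degrees of freedom but not the p-value.

```python
import math
from statistics import mean, stdev


def paired_t_statistic(island_props, rest_props):
    """Return (t, degrees_of_freedom) for a paired t-test.

    Each position holds the two proportions measured for one genome:
    novel genes inside predicted islands and novel genes elsewhere.
    """
    if len(island_props) != len(rest_props):
        raise ValueError("each genome needs one island and one non-island value")
    # Per-genome differences are the quantity the paired test examines.
    diffs = [a - b for a, b in zip(island_props, rest_props)]
    n = len(diffs)
    if n < 2:
        raise ValueError("need at least two genomes")
    sd = stdev(diffs)  # sample standard deviation (n - 1 denominator)
    if sd == 0:
        raise ValueError("differences have zero variance; t is undefined")
    t = mean(diffs) / (sd / math.sqrt(n))
    return t, n - 1


# Hypothetical proportions for four genomes (illustrative values only).
island = [0.30, 0.25, 0.40, 0.35]
rest = [0.10, 0.15, 0.20, 0.05]

t, df = paired_t_statistic(island, rest)
print(f"t = {t:.3f}, df = {df}")
```

A positive t indicates higher proportions of novel genes inside islands. Converting t to a p-value requires the Student t distribution with the returned degrees of freedom, which a statistics library would normally supply.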