Evolution of Genomic Content in the Stepwise Emergence of Escherichia coli O157:H7

Abstract
Genome comparisons have demonstrated that dramatic genetic change often underlies the emergence of new bacterial pathogens. Evolutionary analysis of Escherichia coli O157:H7, a pathogen that has emerged as a worldwide public health threat in the past two decades, has posited that this toxin-producing pathogen evolved in a series of steps from O55:H7, a recent ancestor of a nontoxigenic pathogenic clone associated with infantile diarrhea. We used comparative genomic hybridization with 50-mer oligonucleotide microarrays containing probes from both pathogenic and nonpathogenic genomes to infer when genes were acquired and lost. Many ancillary virulence genes identified in the O157 genome were already present in an O55:H7-like progenitor: 27 of the 33 O157:H7-specific genomic islands of >5 kb (O islands) were acquired intact before the split from this immediate ancestor. Most (85%) of the variably absent or present genes are part of prophages or phage-like elements. Divergence in gene content among these closely related strains was ∼140 times greater than divergence at the nucleotide sequence level. A >100-kb region around the O-antigen gene cluster contained highly divergent sequences and also appears to be duplicated in its entirety in one lineage, suggesting that the whole region was cotransferred in the antigenic shift from O55 to O157. The β-glucuronidase-positive O157 variants, although phylogenetically closest to the Sakai strain, were divergent for multiple adherence factors. These observations suggest that, in addition to gaining and losing phage elements, O157:H7 genomes are rapidly diverging and radiating into new niches as the pathogen disseminates.