Computing prokaryotic gene ubiquity: Rescuing the core from extinction
Open Access
- 1 December 2004
- journal article
- research article
- Published by Cold Spring Harbor Laboratory in Genome Research
- Vol. 14 (12) , 2469-2477
- https://doi.org/10.1101/gr.3024704
Abstract
The genomic core concept has found several uses in comparative and evolutionary genomics. Defined as the set of all genes common to (ubiquitous among) all genomes in a phylogenetically coherent group, the core shrinks as the number and phylogenetic diversity of genomes in the relevant group increase. Here, we focus on methods for defining the size and composition of the core of all genes shared by sequenced genomes of prokaryotes (Bacteria and Archaea). There are few (almost certainly fewer than 50) genes shared by all of the 147 genomes compared, surely insufficient to conduct all essential functions. Sequencing and annotation errors are responsible for the apparent absence of some genes, while very limited but genuine disappearances (from just one or a few genomes) can account for several others. Core size will continue to decrease as more genome sequences appear, unless the requirement for ubiquity is relaxed. Such relaxation seems consistent with any reasonable biological purpose for seeking a core, but it makes the core harder to define. We propose an alternative approach (the phylogenetically balanced core), which preserves some of the biological utility of the core concept. Cores, however delimited, preferentially contain informational rather than operational genes; we present a new hypothesis for why this might be so.
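The difference between a strict core and a relaxed one can be illustrated with a small sketch. This is not the authors' method: the function names (`gene_ubiquity`, `core`), the `min_ubiquity` parameter, and the toy genome and gene-family labels are all assumptions made for illustration, and the phylogenetically balanced core is not modelled because the abstract does not define it.

```python
from collections import Counter


def gene_ubiquity(genomes):
    """Return, for each gene family, the fraction of genomes containing it.

    `genomes` maps a genome label to the set of gene-family labels
    found in that genome (hypothetical input format).
    """
    counts = Counter()
    for families in genomes.values():
        counts.update(set(families))  # count each family once per genome
    n_genomes = len(genomes)
    return {family: count / n_genomes for family, count in counts.items()}


def core(genomes, min_ubiquity=1.0):
    """Return the families present in at least `min_ubiquity` of the genomes.

    min_ubiquity=1.0 gives the strict core (present in every genome);
    lower values relax the ubiquity requirement.
    """
    ubiquity = gene_ubiquity(genomes)
    return {family for family, u in ubiquity.items() if u >= min_ubiquity}


# Toy data, invented for illustration only (not taken from the paper).
genomes = {
    "genome_A": {"fam1", "fam2", "fam3", "fam4"},
    "genome_B": {"fam1", "fam2", "fam3"},
    "genome_C": {"fam1", "fam2", "fam4"},
    "genome_D": {"fam1", "fam3", "fam4"},
}

strict_core = core(genomes)                       # families in all 4 genomes
relaxed_core = core(genomes, min_ubiquity=0.75)   # families in at least 3 of 4
print(sorted(strict_core))
print(sorted(relaxed_core))
```

In this toy set each of `fam2`, `fam3`, and `fam4` is missing from exactly one genome, so a single absence per family (whether a genuine loss or an annotation error) is enough to remove it from the strict core, while a relaxed threshold keeps it.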