A Genomic Perspective on Protein Families
Open Access
- 24 October 1997
- journal article
- review article
- Published by American Association for the Advancement of Science (AAAS) in Science
- Vol. 278 (5338), 631-637
- https://doi.org/10.1126/science.278.5338.631
Abstract
In order to extract the maximum amount of information from the rapidly accumulating genome sequences, all conserved genes need to be classified according to their homologous relationships. Comparison of proteins encoded in seven complete genomes from five major phylogenetic lineages and elucidation of consistent patterns of sequence similarities allowed the delineation of 720 clusters of orthologous groups (COGs). Each COG consists of individual orthologous proteins or orthologous sets of paralogs from at least three lineages. Orthologs typically have the same function, allowing transfer of functional information from one member to an entire COG. This relation automatically yields a number of functional predictions for poorly characterized genomes. The COGs comprise a framework for functional and evolutionary genome analysis.
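The membership rule stated in the abstract (a COG must include proteins from at least three lineages) can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the protein identifiers, the lineage labels, and the helper names `lineages_covered` and `qualifies_as_cog` are hypothetical, and the sketch assumes candidate groups of orthologs have already been assembled by some upstream sequence comparison.

```python
from typing import Dict, Iterable, Set

# Threshold taken from the abstract: at least three lineages per COG.
MIN_LINEAGES = 3


def lineages_covered(members: Iterable[str], lineage_of: Dict[str, str]) -> Set[str]:
    """Return the set of distinct lineages represented by the member proteins."""
    return {lineage_of[protein] for protein in members}


def qualifies_as_cog(
    members: Iterable[str],
    lineage_of: Dict[str, str],
    min_lineages: int = MIN_LINEAGES,
) -> bool:
    """Check the lineage-coverage rule only; orthology itself is assumed upstream."""
    return len(lineages_covered(members, lineage_of)) >= min_lineages


# Hypothetical mapping from protein identifier to lineage label.
lineage_of = {
    "protA1": "lineage_A",
    "protA2": "lineage_A",  # a paralog of protA1 within the same lineage
    "protB1": "lineage_B",
    "protC1": "lineage_C",
}

# Candidate spanning three lineages (with one set of paralogs) passes the rule.
candidate_ok = ["protA1", "protA2", "protB1", "protC1"]
# Candidate spanning only two lineages does not.
candidate_small = ["protA1", "protA2", "protB1"]

print(qualifies_as_cog(candidate_ok, lineage_of))
print(qualifies_as_cog(candidate_small, lineage_of))
```

Counting distinct lineages rather than distinct proteins matters here: paralogs from a single lineage add members to a group without adding lineage coverage.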