Characterization of species-specific genes using a flexible, web-based querying system.
Open Access
- 1 August 2003
- journal article
- Published by Oxford University Press (OUP) in FEMS Microbiology Letters
- Vol. 225 (2), 213-220
- https://doi.org/10.1016/s0378-1097(03)00512-3
Abstract
We describe a query-based web-accessible system (www.neurogadgets.com/bws.php) for facilitating comparative microbial genomics. A variety of query pages are available, each with numerous options, that allow a biologist to pose relevant questions of genomic data. We illustrate with a characterization of species-specific protein-coding genes (so-called "ORFans"), finding that they are on average smaller, faster evolving, and less G+C-rich, and that they encode proteins more basic in their predicted isoelectric point, compared with non-species-specific genes. Using a dual-threshold approach, we conclude that these are characteristics of true species-specific genes, rather than artifacts of mis-annotation.
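As an illustration of two of the properties compared in the abstract (gene size and G+C content), the following is a minimal Python sketch. It is not the authors' querying system: the function names `gc_fraction` and `summarize`, the toy sequences, and the grouping into "ORFan" and "other" sets are all hypothetical and serve only to show how such a comparison could be computed.

```python
def gc_fraction(seq: str) -> float:
    """Return the fraction of G and C among unambiguous A/C/G/T bases."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    total = gc + at
    return gc / total if total else 0.0


def summarize(genes: dict[str, str]) -> dict[str, float]:
    """Mean length (nucleotides) and mean G+C fraction for a set of genes."""
    if not genes:
        return {"mean_length": 0.0, "mean_gc": 0.0}
    n = len(genes)
    return {
        "mean_length": sum(len(s) for s in genes.values()) / n,
        "mean_gc": sum(gc_fraction(s) for s in genes.values()) / n,
    }


# Hypothetical toy data; real input would be annotated coding sequences.
orfans = {"orfA": "ATGAAATTTTAA", "orfB": "ATGATATAA"}
others = {"geneX": "ATGGCGCCGGGCTGCTAA", "geneY": "ATGCCCGGGGCCTGA"}

print("ORFans:", summarize(orfans))
print("Others:", summarize(others))
```

Comparing the two summaries side by side mirrors the kind of contrast reported in the abstract; a real analysis would also need a statistical test and the dual-threshold filtering, neither of which is specified here.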