Identification and functional analysis of 'hypothetical' genes expressed in Haemophilus influenzae
- 28 April 2004
- journal article
- research article
- Published by Oxford University Press (OUP) in Nucleic Acids Research
- Vol. 32 (8) , 2353-2361
- https://doi.org/10.1093/nar/gkh555
Abstract
The progress in genome sequencing has led to a rapid accumulation in GenBank submissions of uncharacterized 'hypothetical' genes. These genes, which have not been experimentally characterized and whose functions cannot be deduced from simple sequence comparisons alone, now comprise a significant fraction of the public databases. Expression analyses of Haemophilus influenzae cells using a combination of transcriptomic and proteomic approaches resulted in confident identification of 54 'hypothetical' genes that were expressed in cells under normal growth conditions. In an attempt to understand the functions of these proteins, we used a variety of publicly available analysis tools. Close homologs in other species were detected for each of the 54 'hypothetical' genes. For 16 of them, exact functional assignments could be found in one or more public databases. Additionally, we were able to suggest general functional characterization for 27 more genes (comprising ~80% of the total). Findings from this analysis include the identification of a pyruvate-formate lyase-like operon, likely to be expressed not only in H.influenzae but also in several other bacteria. Further, we also observed three genes that are likely to participate in the transport and/or metabolism of sialic acid, an important component of the H.influenzae lipo-oligosaccharide. Accurate functional annotation of uncharacterized genes calls for an integrative approach, combining expression studies with extensive computational analysis and curation, followed by eventual experimental verification of the computational predictions.