Predicting essential genes in fungal genomes
Open Access
- 9 August 2006
- journal article
- research article
- Published by Cold Spring Harbor Laboratory in Genome Research
- Vol. 16 (9) , 1126-1135
- https://doi.org/10.1101/gr.5144106
Abstract
Essential genes are required for an organism's viability, and the ability to identify these genes in pathogens is crucial to directed drug development. Predicting essential genes through computational methods is appealing because it circumvents expensive and difficult experimental screens. Most such prediction is based on homology mapping to experimentally verified essential genes in model organisms. We present here a different approach, one that relies exclusively on sequence features of a gene to estimate essentiality and offers a promising way to identify essential genes in unstudied or uncultured organisms. We identified 14 characteristic sequence features potentially associated with essentiality, such as localization signals, codon adaptation, GC content, and overall hydrophobicity. Using the well-characterized baker's yeast Saccharomyces cerevisiae, we employed a simple Bayesian framework to measure the correlation of each of these features with essentiality. We then employed the 14 features to learn the parameters of a machine learning classifier capable of predicting essential genes. We trained our classifier on known essential genes in S. cerevisiae and applied it to the closely related and relatively unstudied yeast Saccharomyces mikatae. We assessed predictive success in two ways: First, we compared all of our predictions with those generated by homology mapping between these two species. Second, we verified a subset of our predictions with eight in vivo knockouts in S. mikatae, and we present here the first experimentally confirmed essential genes in this species.
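As a rough illustration of how a single sequence feature might be scored against essentiality, the following Python sketch computes GC content and a smoothed log-likelihood ratio for a binary feature. It is not the authors' implementation: the abstract says only "a simple Bayesian framework", so the log-ratio scoring, the Laplace pseudocount, and the function names `gc_content` and `log_likelihood_ratio` are all assumptions made for this example.

```python
import math


def gc_content(seq: str) -> float:
    """Return the fraction of G or C bases in a nucleotide sequence."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return sum(base in "GC" for base in seq) / len(seq)


def log_likelihood_ratio(feature, essential, pseudocount=1.0):
    """Score a binary feature against binary essentiality labels.

    Returns log[P(feature=1 | essential) / P(feature=1 | non-essential)],
    with Laplace smoothing so empty classes do not cause division by zero.
    This scoring scheme is an assumption, not the paper's stated method.
    """
    n_ess = sum(1 for e in essential if e)
    n_non = len(essential) - n_ess
    hits_ess = sum(1 for f, e in zip(feature, essential) if f and e)
    hits_non = sum(1 for f, e in zip(feature, essential) if f and not e)
    p_ess = (hits_ess + pseudocount) / (n_ess + 2 * pseudocount)
    p_non = (hits_non + pseudocount) / (n_non + 2 * pseudocount)
    return math.log(p_ess / p_non)
```

A positive score means the feature is seen more often among genes labeled essential, a negative score means it is seen more often among non-essential genes, and a score near zero means the feature carries little information. A continuous feature such as GC content would first have to be turned into a binary one, for example by thresholding, and that threshold is itself a modeling choice.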