EGENES: Transcriptome-Based Plant Database of Genes with Metabolic Pathway Information and Expressed Sequence Tag Indices in KEGG
Open Access
- 27 April 2007
- journal article
- Published by Oxford University Press (OUP) in Plant Physiology
- Vol. 144 (2) , 857-866
- https://doi.org/10.1104/pp.106.095059
Abstract
EGENES is a knowledge-based database for efficient analysis of plant expressed sequence tags (ESTs) that was recently added to the KEGG suite of databases. It links plant genomic information with higher order functional information in a single database. It also provides gene indices for each genome. The genomic information in EGENES is a collection of EST contigs constructed from assembly of ESTs. Due to the extremely large genomes of plant species, the bulk collection of data such as ESTs is a quick way to capture a complete repertoire of genes expressed in an organism. Using ESTs for reconstructing metabolic pathways is a new expansion in KEGG and provides researchers with a new resource for species in which only EST sequences are available. Functional annotation in EGENES is a process of linking a set of genes/transcripts in each genome with a network of interacting molecules in the cell. EGENES is a multispecies, integrated resource consisting of genomic, chemical, and network information containing a complete set of building blocks (genes and molecules) and wiring diagrams (biological pathways) to represent cellular functions. Using EGENES, genome-based pathway annotation and EST-based annotation can now be compared and mutually validated. The ultimate goals of EGENES will be to: bring new plant species into KEGG by clustering and annotating ESTs; abstract knowledge and principles from large-scale plant EST data; and improve computational prediction of systems of higher complexity. EGENES will be updated at least once a year. EGENES is publicly available and is accessible by the following link or by KEGG's navigation system (http://www.genome.jp/kegg-bin/create_kegg_menu?category=plants_egenes).