GreenPhylDB: a database for plant comparative genomics
Open Access
- 5 November 2007
- journal article
- research article
- Published by Oxford University Press (OUP) in Nucleic Acids Research
- Vol. 36 (Database), D991-D998
- https://doi.org/10.1093/nar/gkm934
Abstract
GreenPhylDB (http://greenphyl.cirad.fr) is a comprehensive platform designed to facilitate comparative functional genomics in Oryza sativa and Arabidopsis thaliana genomes. The main functions of GreenPhylDB are to assign O. sativa and A. thaliana sequences to gene families using a semi-automatic clustering procedure and to create ‘orthologous’ groups using a phylogenomic approach. To date, GreenPhylDB comprises the most complete list of plant gene families, which have been manually curated (6421 families). GreenPhylDB also contains all of the phylogenomic relationships computed for 4375 families. A total of 492 TAIR, 1903 InterPro and 981 KEGG families and subfamilies were manually curated using the clusters created with the TribeMCL software. GreenPhylDB integrates information from several other databases including UniProt, KEGG, InterPro, TAIR and TIGR. Several entry points can be used to display phylogenomic relationships for A. thaliana or O. sativa sequences, using TAIR, TIGR gene ID, family name, InterPro, gene alias, UniProt or protein/nucleic sequence. Finally, a powerful phylogenomics tool, GreenPhyl Ortholog Search Tool (GOST), was incorporated into GreenPhylDB to predict orthologous relationships between O. sativa/A. thaliana protein(s) and sequences from other plant species.
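To illustrate what deriving 'orthologous' groups from a gene-family tree can look like, here is a minimal sketch in Python. It uses a simple species-overlap heuristic, which is an assumption made for illustration and not necessarily the procedure GreenPhylDB runs. An internal node whose two subtrees share no species is treated as a speciation event, and genes on opposite sides of it are reported as orthologs. A node whose subtrees share a species is treated as a duplication. The tree encoding (nested tuples), the `species|gene` leaf labels, and the gene names are all hypothetical.

```python
from itertools import product


def leaves(node):
    """Return all leaf labels under a node (a leaf is a 'species|gene' string)."""
    if isinstance(node, str):
        return [node]
    left, right = node
    return leaves(left) + leaves(right)


def species_of(leaf):
    """Extract the species prefix from a 'species|gene' label."""
    return leaf.split("|")[0]


def ortholog_pairs(node):
    """Collect ortholog pairs under a binary gene tree via species overlap.

    Assumption: a node is a speciation when its child subtrees contain
    disjoint species sets, and a duplication otherwise.
    """
    if isinstance(node, str):
        return set()
    left, right = node
    pairs = ortholog_pairs(left) | ortholog_pairs(right)
    left_leaves, right_leaves = leaves(left), leaves(right)
    left_species = {species_of(leaf) for leaf in left_leaves}
    right_species = {species_of(leaf) for leaf in right_leaves}
    if left_species.isdisjoint(right_species):
        # Speciation: every gene on one side is orthologous to every gene on the other.
        for a, b in product(left_leaves, right_leaves):
            pairs.add(tuple(sorted((a, b))))
    return pairs


# Hypothetical family: an ancestral duplication followed by the rice/Arabidopsis split.
example_tree = (("Os|OS_A1", "At|AT_A1"), ("Os|OS_A2", "At|AT_A2"))
for pair in sorted(ortholog_pairs(example_tree)):
    print(pair)
```

In this toy family the root is classified as a duplication, so only the two within-copy rice/Arabidopsis pairs are reported and the cross-copy pairs are left out as paralogs.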