Babel's tower revisited: a universal resource for cross-referencing across annotation databases
Open Access
- 29 June 2006
- journal article
- research article
- Published by Oxford University Press (OUP) in Bioinformatics
- Vol. 22 (23), 2934-2939
- https://doi.org/10.1093/bioinformatics/btl372
Abstract
Motivation: Annotation databases are widely used as public repositories of biological knowledge. However, most of these resources have been developed by independent groups which used different designs and different identifiers for the same biological entities. As we show in this article, incoherent name spaces between various databases represent a serious impediment to using the existing annotations at their full potential. Navigating between various such name spaces by mapping IDs from one database to another is a very important issue which is not properly addressed at the moment.
Results: We have developed a web-based resource, Onto-Translate (OT), which effectively addresses this problem. OT is able to map onto each other different types of biological entities from the following annotation databases: Swiss-Prot, TrEMBL, NREF, PIR, Gene Ontology, KEGG, Entrez Gene, GenBank, GenPept, IMAGE, RefSeq, UniGene, OMIM, PDB, Eukaryotic Promoter Database, HUGO Gene Nomenclature Committee and NetAffx. Currently, OT is able to perform 462 types of mappings between 29 different types of IDs from 17 databases concerning 53 organisms. Among these, over 300 types of translations and 15 types of IDs are not currently supported by any other tool or resource. On average, OT is able to correctly map between 96 and 99% of the biological entities provided as input. In terms of speed, sets of ∼20 000 IDs can be translated in
Availability: OT is a part of Onto-Tools, which is freely available at
Contact: sorin@wayne.edu
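To make the idea of mapping IDs between name spaces concrete, the following is a minimal Python sketch of a table-driven translation, together with the kind of "fraction of inputs mapped" figure the abstract reports. It is not the Onto-Translate implementation, which this record does not describe. The record layout, the function names (`build_index`, `translate`, `coverage`) and every identifier in `XREFS` are hypothetical placeholders chosen for illustration.

```python
from collections import defaultdict

# Hypothetical cross-reference records:
# (source ID type, source ID, target ID type, target ID).
# All identifiers are made-up placeholders, not real database entries.
XREFS = [
    ("probe", "PROBE_1", "gene", "GENE_A"),
    ("probe", "PROBE_2", "gene", "GENE_A"),
    ("probe", "PROBE_3", "gene", "GENE_B"),
    ("gene", "GENE_A", "protein", "PROT_X"),
    ("gene", "GENE_B", "protein", "PROT_Y"),
    ("gene", "GENE_B", "protein", "PROT_Z"),
]


def build_index(xrefs):
    """Index records by (source type, target type, source ID)."""
    index = defaultdict(set)
    for src_type, src_id, dst_type, dst_id in xrefs:
        index[(src_type, dst_type, src_id)].add(dst_id)
    return index


def translate(ids, src_type, dst_type, index):
    """Map each input ID to its target IDs; collect IDs with no mapping."""
    mapped, unmapped = {}, []
    for identifier in ids:
        targets = index.get((src_type, dst_type, identifier))
        if targets:
            # One source ID may map to several target IDs.
            mapped[identifier] = sorted(targets)
        else:
            unmapped.append(identifier)
    return mapped, unmapped


def coverage(mapped, unmapped):
    """Fraction of input IDs that received at least one mapping."""
    total = len(mapped) + len(unmapped)
    return len(mapped) / total if total else 0.0


index = build_index(XREFS)
mapped, unmapped = translate(["PROBE_1", "PROBE_3", "PROBE_9"], "probe", "gene", index)
print(mapped, unmapped, coverage(mapped, unmapped))
```

Keying the index on the pair of ID types as well as the source ID keeps each type of mapping (for example probe to gene, or gene to protein) separate, so adding a database means adding records rather than code. A real resource would also have to handle retired or ambiguous identifiers, which this sketch ignores.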