EpoDB: a prototype database for the analysis of genes expressed during vertebrate erythropoiesis
Open Access
- 1 January 1999
- journal article
- research article
- Published by Oxford University Press (OUP) in Nucleic Acids Research
- Vol. 27 (1) , 200-203
- https://doi.org/10.1093/nar/27.1.200
Abstract
EpoDB is a database of genes expressed in vertebrate red blood cells. It is also a prototype for the creation of cell and tissue-specific databases from multiple external sources. The information in EpoDB obtained from GenBank, SWISS-PROT, Transfac, TRRD and GERD is curated to provide high quality data for sequence analysis aimed at understanding gene regulation during erythropoiesis. New protocols have been developed for data integration and updating entries. Using a BLAST-based algorithm, we have grouped GenBank entries representing the same gene together. This sequence similarity protocol was also used to identify new entries to be included in EpoDB. We have recently implemented our database in Sybase (relational tables) in addition to SICStus Prolog to provide us with greater flexibility in asking complex queries that utilize information from multiple sources. New additions to the public web site (http://www.cbil.upenn.edu/epodb) for accessing EpoDB are the ability to retrieve groups of entries representing different variants of the same gene and to retrieve gene expression data. The BLAST query has been enhanced by incorporating BLASTView, an interactive and graphical display of BLAST results. We have also enhanced the queries for retrieving sequence from specified genes by the addition of MEME, a motif discovery tool, to the integrated analysis tools which include CLUSTALW and TESS.
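The abstract does not spell out how the BLAST-based grouping works, so the following is only a minimal sketch of one plausible approach, not the authors' protocol. It assumes that pairwise BLAST hits have already been reduced to (query, subject, percent identity) tuples and that entries linked by a sufficiently strong hit belong to the same gene (single-linkage grouping). The function name `group_entries`, the `min_identity` threshold of 95.0, and the entry identifiers are all hypothetical.

```python
from collections import defaultdict


def group_entries(accessions, hits, min_identity=95.0):
    """Group entries that are connected by high-identity pairwise hits.

    accessions   -- iterable of entry identifiers (hypothetical names)
    hits         -- iterable of (query, subject, percent_identity) tuples,
                    assumed to be pre-parsed from BLAST output
    min_identity -- assumed cutoff; not a value taken from the paper
    """
    # Each entry starts out as the representative of its own group.
    parent = {acc: acc for acc in accessions}

    def find(acc):
        # Walk up to the group representative, shortening the path as we go.
        while parent[acc] != acc:
            parent[acc] = parent[parent[acc]]
            acc = parent[acc]
        return acc

    for query, subject, identity in hits:
        # Ignore hits to entries outside the set being grouped.
        if query not in parent or subject not in parent:
            continue
        if identity >= min_identity:
            root_q, root_s = find(query), find(subject)
            if root_q != root_s:
                parent[root_s] = root_q  # merge the two groups

    groups = defaultdict(list)
    for acc in accessions:
        groups[find(acc)].append(acc)
    # Sort members and groups so the result is deterministic.
    return sorted(sorted(members) for members in groups.values())


if __name__ == "__main__":
    entries = ["ENTRY_A", "ENTRY_B", "ENTRY_C", "ENTRY_D"]
    pairwise_hits = [
        ("ENTRY_A", "ENTRY_B", 98.0),
        ("ENTRY_B", "ENTRY_C", 96.5),
        ("ENTRY_C", "ENTRY_D", 80.0),  # below the cutoff, so no merge
    ]
    for group in group_entries(entries, pairwise_hits):
        print(group)
```

Single-linkage grouping is used here because it is the simplest way to turn pairwise hits into groups. A real curation pipeline would likely also consider alignment coverage, organism, and manual review before declaring two entries to be the same gene.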