DAVID Knowledgebase: a gene-centered database integrating heterogeneous gene annotation resources to facilitate high-throughput gene functional analysis

Open Access

2 November 2007

journal article
database
Published by Springer Nature in BMC Bioinformatics

Vol. 8 (1) , 1-11
https://doi.org/10.1186/1471-2105-8-426

Abstract

Background: Due to the complex and distributed nature of biological research, our current biological knowledge is spread over many redundant annotation databases maintained by many independent groups. Analysts usually need to visit many of these bioinformatics databases to integrate comprehensive annotation information for their genes, which becomes a bottleneck, particularly for analytic tasks involving a large gene list. Thus, a highly centralized and ready-to-use gene-annotation knowledgebase is in demand for high-throughput gene functional analysis.

Description: The DAVID Knowledgebase is built around the DAVID Gene Concept, a single-linkage method to agglomerate tens of millions of gene/protein identifiers from a variety of public genomic resources into DAVID gene clusters. Grouping these identifiers improves cross-reference capability, particularly across the NCBI and UniProt systems, enabling more than 40 publicly available functional annotation sources to be comprehensively integrated and centralized by the DAVID gene clusters. The simple, pair-wise, text format files which make up the DAVID Knowledgebase are freely downloadable for various data analysis uses. In addition, a well-organized web interface allows users to query different types of heterogeneous annotations in a high-throughput manner.

Conclusion: The DAVID Knowledgebase is designed to facilitate high-throughput gene functional analysis. For a given gene list, it not only provides quick access to a wide range of heterogeneous annotation data in a centralized location, but also enriches the level of biological information for an individual gene. Moreover, the entire DAVID Knowledgebase is freely downloadable or searchable at http://david.abcc.ncifcrf.gov/knowledgebase/.
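The single-linkage grouping mentioned in the Description can be illustrated with a small sketch. This is not the DAVID implementation; it only shows the general idea that any two identifiers linked by a cross-reference end up in the same cluster, directly or through a chain of links. The function names (`read_pairs`, `single_linkage_clusters`), the tab-separated two-column layout, and all identifiers below are hypothetical placeholders chosen for illustration, not taken from the DAVID files.

```python
from collections import defaultdict


def read_pairs(lines):
    """Parse pair-wise lines, assuming (hypothetically) two tab-separated IDs per line."""
    for line in lines:
        line = line.strip()
        if not line:
            continue
        left, right = line.split("\t")
        yield left, right


def single_linkage_clusters(pairs):
    """Group identifiers connected through any chain of pair-wise links (union-find)."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        root = x
        while parent[root] != root:
            root = parent[root]
        # Path compression: point every node on the walked path at the root.
        while parent[x] != root:
            parent[x], x = root, parent[x]
        return root

    for a, b in pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    clusters = defaultdict(set)
    for ident in list(parent):
        clusters[find(ident)].add(ident)
    return sorted(sorted(members) for members in clusters.values())


# Made-up identifiers; real files would contain actual database accessions.
example_lines = [
    "entrez:1001\trefseq:NM_0001",
    "refseq:NM_0001\tuniprot:P00001",
    "entrez:2002\tuniprot:Q00002",
]
clusters = single_linkage_clusters(read_pairs(example_lines))
for members in clusters:
    print(members)
```

In this toy input the first two lines share `refseq:NM_0001`, so their three identifiers merge into one cluster even though `entrez:1001` and `uniprot:P00001` are never paired directly. Single linkage is simple and scales to very large identifier sets, at the cost that one incorrect cross-reference can merge two otherwise unrelated clusters.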
