DATABASE: A new forum for biological databases and curation

Abstract
Most computational tools for biologists work best with large amounts of data: the larger the quantity of data, the more rigorous the statistical analyses that can support the discovery of new hypotheses for testing in the laboratory. A variety of technological developments over the past two decades have accelerated the rate at which data are deposited into databases. There are now many public databases in which data such as DNA and protein sequences or 3D protein structures, as well as more complex information types such as ontologies, networks and pathways, are deposited, maintained, annotated, curated and stored. More recent efforts to store, for example, phenotypes (in addition to genotypes) and clinical trial data signal a trend toward gathering ever more complex data types. The data collected in these large public repositories represent valuable and significant resources for ongoing knowledge extraction. Mining these data with computational tools is an increasingly indispensable part of modern research, and the organized storage of the data in databases is therefore obligatory; such approaches are also likely to have a significant impact on the reproducibility of results. Yet the tools developed for the establishment, interrogation, rearrangement, display and interpretation of new and large databases are frequently treated as minor points in a publication, relegated to brief statements in methods sections or figure legends when the final work is published. As a result, the original and creative computational methods that enabled these discoveries often go uncommunicated in the scientific literature, because the description of a database and the tools used to interact with it are not deemed essential to the communication.
