Large-scale open bioinformatics data resources

1 June 2002

journal article
research article

Vol. 4 (3), 265-274

Abstract

The data explosion in bioinformatics is relentless. More and more genomes are being sequenced, and many new types of datasets are being generated in large-scale projects. Integration of, and true open access to, the data remain difficult issues, although they are gradually being addressed. Notably, certain fields have good standardization and interoperability, while others lag behind. This review summarizes the latest developments in genome and sequence databases, transcriptomics data (ESTs, ORESTES, full-length cDNAs), proteomics data (protein databases, protein structures, family and domain classification), as well as loosely integrated fields such as microarray experiments, mutation databases, and databases of regulatory regions and elements. Rather than simply summarizing what data are available, the review aims to provide a critical look at some of the integration and access issues associated with several of these resources.
