CDD: a Conserved Domain Database for the functional annotation of proteins

Open Access

24 November 2010

journal article
research article
Published by Oxford University Press (OUP) in Nucleic Acids Research

Vol. 39 (Database), D225-D229
https://doi.org/10.1093/nar/gkq1189

Abstract

NCBI’s Conserved Domain Database (CDD) is a resource for the annotation of protein sequences with the location of conserved domain footprints, and functional sites inferred from these footprints. CDD includes manually curated domain models that make use of protein 3D structure to refine domain models and provide insights into sequence/structure/function relationships. Manually curated models are organized hierarchically if they describe domain families that are clearly related by common descent. As CDD also imports domain family models from a variety of external sources, it is a partially redundant collection. To simplify protein annotation, redundant models and models describing homologous families are clustered into superfamilies. By default, domain footprints are annotated with the corresponding superfamily designation, on top of which specific annotation may indicate high-confidence assignment of family membership. Pre-computed domain annotation is available for proteins in the Entrez/Protein dataset, and a novel interface, Batch CD-Search, allows the computation and download of annotation for large sets of protein queries. CDD can be accessed via http://www.ncbi.nlm.nih.gov/Structure/cdd/cdd.shtml.
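
The two-level annotation described in the abstract (a superfamily label by default, with a family-level label added only for high-confidence hits) can be illustrated with a minimal sketch. Everything in it is hypothetical: the `DomainHit` record, the model and superfamily names, the scores, and the per-model `specific_threshold` cutoff are assumptions made for illustration. They are not CDD's actual data model, scoring scheme, or API.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DomainHit:
    """One domain footprint on a protein (hypothetical record, not a CDD format)."""
    model: str                 # name of the matching domain model (made up)
    superfamily: str           # superfamily cluster that model belongs to (made up)
    start: int                 # first residue of the footprint
    end: int                   # last residue of the footprint
    score: float               # match score (arbitrary units, assumed)
    specific_threshold: float  # assumed per-model cutoff for a high-confidence call


def annotate(hit: DomainHit) -> dict:
    """Label a footprint with its superfamily, adding the family only if confident."""
    specific: Optional[str] = None
    if hit.score >= hit.specific_threshold:
        # High-confidence assignment: report family membership on top of the default.
        specific = hit.model
    return {
        "footprint": (hit.start, hit.end),
        "superfamily": hit.superfamily,  # always reported (the default annotation)
        "specific": specific,            # None when only the superfamily is supported
    }


hits = [
    DomainHit("modelA", "sfam1", 10, 120, score=95.0, specific_threshold=80.0),
    DomainHit("modelB", "sfam2", 150, 260, score=40.0, specific_threshold=60.0),
]
annotations = [annotate(h) for h in hits]
```

In this sketch the first footprint carries both a superfamily and a specific label, while the second falls below its assumed cutoff and keeps only the superfamily label.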

Keywords

NATIONAL LIBRARY OF MEDICINE (U.S.)
