A Web-based classification system of DNA-binding protein families.
Open Access
- 1 July 2001
- journal article
- research article
- Published by Oxford University Press (OUP) in Protein Engineering, Design and Selection
- Vol. 14 (7) , 465-472
- https://doi.org/10.1093/protein/14.7.465
Abstract
Rational classification of proteins encoded in sequenced genomes is critical for making the genome sequences maximally useful for functional and evolutionary studies. DNA-binding proteins form one of the most populated and most studied groups across the genomes of bacteria, archaea and eukaryotes, and the Web-based system presented here is an approach to their classification. The DnaProt resource is an annotated and searchable collection of protein sequences for the families of DNA-binding proteins. The database contains 3238 full-length sequences (retrieved from the SWISS-PROT database, release 38), each of which includes at least one DNA-binding domain. Sequence entries are organized into families defined by PROSITE patterns, PRINTS motifs and de novo excised signatures. By combining global similarities and functional motifs into a single classification scheme, DNA-binding proteins are classified into 33 unique classes, which helps to reveal comprehensive family relationships. To maximize family information retrieval, DnaProt contains a collection of multiple alignments for each DNA-binding family, and the recognized motifs can be used as diagnostic functional fingerprints. All available structural class representatives have been referenced. The resource was developed as a Web-based management system for free online access to customized data sets. Entries are fully hyperlinked to facilitate easy retrieval of the original records from the source databases, and functional and phylogenetic annotation will be applied to newly sequenced genomes. The database is freely available from our WWW server (http://kronos.biol.uoa.gr/~mariak/dbDNA.html) for online searching of a library containing specific patterns of the identified DNA-binding protein classes and for retrieval of individual entries.
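As an illustration of how pattern-defined families can be used to assign sequences to classes, the following minimal Python sketch translates patterns written in PROSITE syntax into regular expressions and scans sequences against them. It is not the DnaProt implementation: the function names (`prosite_to_regex`, `classify`), the family names, the two patterns and the two sequences are all hypothetical and chosen only to show the mechanics.

```python
import re


def prosite_to_regex(pattern: str) -> str:
    """Translate a PROSITE-syntax pattern into a Python regular expression.

    Handles the common elements: 'x' (any residue), '[ABC]' (allowed set),
    '{ABC}' (forbidden set), repetition counts such as '(3)' or '(2,4)',
    and the '<' / '>' terminus anchors.
    """
    parts = []
    for element in pattern.rstrip(".").split("-"):
        prefix = suffix = ""
        if element.startswith("<"):      # pattern anchored at the N-terminus
            prefix, element = "^", element[1:]
        if element.endswith(">"):        # pattern anchored at the C-terminus
            suffix, element = "$", element[:-1]

        # Separate the residue specification from an optional repeat count.
        match = re.fullmatch(r"([^()]+)(?:\((\d+(?:,\d+)?)\))?", element)
        if match is None:
            raise ValueError(f"unsupported pattern element: {element!r}")
        core, repeat = match.groups()

        if core == "x":
            core = "."                       # any amino acid
        elif core.startswith("{") and core.endswith("}"):
            core = "[^" + core[1:-1] + "]"   # any residue except those listed
        # '[ABC]' and single letters are already valid regex fragments.

        parts.append(prefix + core + ("{" + repeat + "}" if repeat else "") + suffix)
    return "".join(parts)


def classify(sequence: str, families: dict[str, str]) -> list[str]:
    """Return the names of all families whose pattern occurs in the sequence."""
    return [
        name
        for name, pattern in families.items()
        if re.search(prosite_to_regex(pattern), sequence)
    ]


# Hypothetical families and toy sequences (assumptions, not DnaProt data).
TOY_FAMILIES = {
    "toy_family_A": "C-x(2,4)-C-x(3)-[LIVM]-x(2)-H",
    "toy_family_B": "[RK]-x-{P}-G-x(2)-W",
}

if __name__ == "__main__":
    for seq in ("MKCAACPQRLAAHK", "MSRAVGTTWE"):
        print(seq, classify(seq, TOY_FAMILIES))
```

A sequence may match several patterns, so `classify` returns a list rather than a single label; a real resource would additionally combine such motif hits with global sequence similarity, as the abstract describes.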