Gene3D: modelling protein structure, function and evolution
Open Access
- 1 January 2006
- journal article
- research article
- Published by Oxford University Press (OUP) in Nucleic Acids Research
- Vol. 34 (90001), D281-D284
- https://doi.org/10.1093/nar/gkj057
Abstract
The Gene3D release 4 database and web portal (http://cathwww.biochem.ucl.ac.uk:8080/Gene3D) provide a combined structural, functional and evolutionary view of the protein world. It is focussed on providing structural annotation for protein sequences without structural representatives, including the complete proteome sets of over 240 different species. The protein sequences have also been clustered into whole-chain families so as to aid functional prediction. The structural annotation is generated using HMM models based on the CATH domain families; CATH is a repository for manually deduced protein domains. Amongst the changes from the last publication are: the addition of over 100 genomes and the UniProt sequence database, domain data from Pfam, metabolic pathway and functional data from COGs, KEGG and GO, and protein-protein interaction data from MINT and BIND. The website has been rebuilt to allow more sophisticated querying and the data returned is presented in a clearer format with greater functionality. Furthermore, all data can be downloaded in a simple XML format, allowing users to carry out complex investigations at their own computers.