ProtoNet: hierarchical classification of the protein space
- 1 January 2003
- journal article
- research article
- Published by Oxford University Press (OUP) in Nucleic Acids Research
- Vol. 31 (1), 348-352
- https://doi.org/10.1093/nar/gkg096
Abstract
The ProtoNet site provides an automatic hierarchical clustering of the SWISS-PROT protein database. The clustering is based on an all-against-all BLAST similarity search. The similarity E-score is used to perform a continuous bottom-up clustering process by applying alternative rules for merging clusters. The outcome of this clustering process is a classification of the input proteins into a hierarchy of clusters of varying degrees of granularity. ProtoNet (version 1.3) is accessible in the form of an interactive web site at http://www.protonet.cs.huji.ac.il. ProtoNet provides navigation tools for monitoring the clustering process with a vertical and horizontal view. Each cluster at any level of the hierarchy is assigned a statistical index indicating its level of purity, based on biological keywords such as those provided by SWISS-PROT and InterPro. ProtoNet can be used for function prediction, for defining superfamilies and subfamilies, and for large-scale protein annotation purposes.
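The bottom-up clustering described in the abstract can be illustrated with a minimal sketch. This is not the ProtoNet implementation. The abstract does not name the merging rules, so the arithmetic and geometric means used here are illustrative assumptions, and the protein names and E-scores are invented toy values. Starting from singleton clusters, the sketch repeatedly merges the pair of clusters whose averaged pairwise E-score is lowest (lower meaning more similar) and records each merge, which yields a hierarchy.

```python
from itertools import combinations
from math import exp, log


def arithmetic_mean(values):
    # One possible merging rule (assumption): plain average of E-scores.
    return sum(values) / len(values)


def geometric_mean(values):
    # Another possible merging rule (assumption): average in log space.
    return exp(sum(log(v) for v in values) / len(values))


def cluster_bottom_up(scores, merge_rule):
    """Agglomerative clustering over all-against-all pairwise scores.

    scores: dict mapping frozenset({a, b}) -> E-score (lower = more similar).
    merge_rule: function reducing a list of pairwise E-scores to one value.
    Returns the merges in order as (merged_cluster, score) tuples.
    """
    proteins = sorted({p for pair in scores for p in pair})
    clusters = [frozenset([p]) for p in proteins]
    merges = []
    while len(clusters) > 1:
        best = None
        for a, b in combinations(clusters, 2):
            pair_scores = [scores[frozenset((x, y))] for x in a for y in b]
            score = merge_rule(pair_scores)
            if best is None or score < best[0]:
                best = (score, a, b)
        score, a, b = best
        # Replace the two chosen clusters with their union.
        clusters = [c for c in clusters if c not in (a, b)] + [a | b]
        merges.append((a | b, score))
    return merges


# Hypothetical toy data: four proteins with invented E-scores.
toy_scores = {
    frozenset(("P1", "P2")): 1e-50,
    frozenset(("P3", "P4")): 1e-40,
    frozenset(("P1", "P3")): 1e-5,
    frozenset(("P1", "P4")): 1e-4,
    frozenset(("P2", "P3")): 1e-3,
    frozenset(("P2", "P4")): 1e-2,
}

merges = cluster_bottom_up(toy_scores, arithmetic_mean)
for cluster, score in merges:
    print(sorted(cluster), score)
```

Passing `geometric_mean` instead of `arithmetic_mean` changes how the scores between clusters are averaged and can produce a different hierarchy on real data. On this toy input both rules merge the clusters in the same order.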