DOUTfinder--identification of distant domain outliers using subsignificant sequence similarity
Open Access
- 1 July 2006
- journal article
- research article
- Published by Oxford University Press (OUP) in Nucleic Acids Research
- Vol. 34 (Web Server) , W214-W218
- https://doi.org/10.1093/nar/gkl332
Abstract
DOUTfinder is a web-based tool facilitating protein domain detection among related protein sequences in the twilight zone of sequence similarity. The sequence set required for this analysis can be provided by the user or will be collected using PSI-BLAST if a single sequence is given as input. The obtained sequence family is analyzed for known Pfam and SMART domains, and the subsignificant domain similarities identified in this way are evaluated further. Domains with several subthreshold hits in the query set are ranked based on a sum-score function, and likely homologous domains are suggested according to established cut-offs. By providing a post-filtering procedure for subsignificant domain hits, DOUTfinder allows the detection of non-trivial domain relationships and can thereby lead to new insights into the function and evolution of distantly related sequence families. DOUTfinder is available at http://mendel.imp.ac.at/dout/.
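The ranking step described in the abstract can be illustrated with a minimal sketch. The abstract does not state the actual sum-score function or cut-offs, so everything specific here is an assumption made for illustration: the function name `rank_domains`, the use of a summed `-log10(E-value)` as the score, the E-value window that defines a "subsignificant" hit, and the minimum number of supporting sequences.

```python
import math
from collections import defaultdict


def rank_domains(hits, significant=0.01, ceiling=1.0, min_hits=2):
    """Rank domains by a sum-score over subsignificant hits.

    hits: iterable of (sequence_id, domain_name, e_value) tuples.
    All thresholds and the -log10(E) scoring are illustrative
    assumptions, not the published DOUTfinder parameters.
    """
    # Keep only the best subsignificant hit per (domain, sequence) pair.
    best = {}
    for seq_id, domain, e_value in hits:
        if significant < e_value <= ceiling:
            key = (domain, seq_id)
            if key not in best or e_value < best[key]:
                best[key] = e_value

    # Sum the per-sequence scores for each domain.
    scores = defaultdict(float)
    counts = defaultdict(int)
    for (domain, _seq_id), e_value in best.items():
        scores[domain] += -math.log10(e_value)
        counts[domain] += 1

    # Report only domains supported by several sequences, best first.
    ranked = [
        (domain, scores[domain], counts[domain])
        for domain in scores
        if counts[domain] >= min_hits
    ]
    ranked.sort(key=lambda row: row[1], reverse=True)
    return ranked


if __name__ == "__main__":
    example_hits = [
        ("s1", "A", 0.1),
        ("s2", "A", 0.5),
        ("s3", "B", 0.05),
        ("s2", "B", 0.9),
        ("s1", "C", 1e-10),  # already significant, so not a subthreshold hit
        ("s1", "D", 0.2),    # only one supporting sequence
    ]
    for domain, score, n in rank_domains(example_hits):
        print(f"{domain}\tsum-score={score:.3f}\tsequences={n}")
```

The point of the sketch is that a domain with several weak hits spread across the family can outrank a domain with a single weak hit, which is the kind of evidence the post-filtering procedure is described as exploiting.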