Protein domain identification and improved sequence similarity searching using PSI‐BLAST

12 July 2002

journal article
research article
Published by Wiley in Proteins-Structure Function and Bioinformatics

Vol. 48 (4), 672-681
https://doi.org/10.1002/prot.10175

Abstract

Protein sequences containing more than one structural domain are problematic in homology searches, where they can either stop an iterative database search prematurely or cause an explosion of the search to common domains. We describe a method, DOMAINATION, that infers domains and their boundaries in a query sequence from local gapped alignments generated using PSI‐BLAST. Using a new technique to recognize domain insertions and permutations, DOMAINATION submits the delineated domains as successive database queries in further iterative steps. Assessed over a set of 452 multidomain proteins, the method predicts structural domain boundaries with an overall accuracy of 50% and improves the detection of distant homologies by 14% compared with PSI‐BLAST. DOMAINATION is available as a web-based tool at http://mathbio.nimr.mrc.ac.uk, and the source code is available from the authors upon request. Proteins 2002;48:672–681.
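
The abstract does not spell out how boundaries are derived from the alignments, so the following is only a minimal sketch of the general idea, not DOMAINATION's actual procedure. It assumes that positions in the query where many local alignments start or end are plausible domain boundaries. The function name `candidate_boundaries` and the parameters `window`, `min_support`, and `end_margin` are hypothetical and chosen purely for illustration.

```python
from collections import Counter


def candidate_boundaries(alignments, query_length,
                         window=10, min_support=3, end_margin=20):
    """Suggest query positions where many local alignments start or end.

    alignments: iterable of (start, end) query coordinates, 1-based inclusive.
    All thresholds are illustrative assumptions, not values from the paper.
    """
    # Count how many alignment termini fall on each query position.
    termini = Counter()
    for start, end in alignments:
        termini[start] += 1
        termini[end] += 1

    # Sum the termini in a window centred on each position, skipping the
    # sequence ends, where termini are expected regardless of domains.
    half = window // 2
    support = {}
    for pos in range(1 + end_margin, query_length - end_margin + 1):
        support[pos] = sum(termini[p] for p in range(pos - half, pos + half + 1))

    # Greedily keep the best-supported positions, suppressing any candidate
    # that lies within one window of a position already chosen.
    boundaries = []
    for pos in sorted(support, key=lambda p: (-support[p], p)):
        if support[pos] < min_support:
            break
        if all(abs(pos - b) > window for b in boundaries):
            boundaries.append(pos)
    return sorted(boundaries)


if __name__ == "__main__":
    # Made-up alignments covering two halves of a 200-residue query.
    hits = [(1, 100), (3, 98), (2, 102), (101, 200), (99, 199), (103, 200)]
    print(candidate_boundaries(hits, query_length=200))
```

In this made-up example the alignment termini cluster around residue 100, so the sketch reports a single candidate boundary in that region. A real pipeline would also have to handle the domain insertions and permutations mentioned in the abstract, which this sketch does not attempt.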
