A computational strategy for protein function assignment which addresses the multidomain problem
Open Access
- 2 October 2002
- journal article
- research article
- Published by Wiley in Comparative and Functional Genomics
- Vol. 3 (5) , 423-440
- https://doi.org/10.1002/cfg.208
Abstract
A method is presented for assigning functions to unknown sequences by finding correlations between short signals and functional annotations in a protein database. The approach is based on the keyword (KW) and feature (FT) information stored in the SWISS‐PROT database. The former refers to particular protein characteristics and the latter locates these characteristics at a specific sequence position. In this way, a keyword is assigned to a sequence only if sequence similarity is found at the position described by the FT field. Exhaustive tests performed over sequences with homologues (cluster set) and without homologues (singleton set) in the database show that function assignment is much ‘cleaner’ when information about domains (FT field) is used than when only the keywords are used. Copyright © 2002 John Wiley & Sons, Ltd.
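The position-restricted transfer described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the `Feature` record, the function names, the one-keyword-per-feature pairing and the 0.8 coverage threshold are all assumptions made for illustration, since the abstract does not state the actual overlap criterion. The idea shown is only that a keyword is transferred when the region of the database protein matched by the query covers the region the FT field associates with that keyword.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Feature:
    """Hypothetical FT-like record: a keyword tied to a sequence region."""
    keyword: str  # annotation associated with this region (assumed pairing)
    start: int    # 1-based, inclusive
    end: int      # 1-based, inclusive


def overlap_fraction(feature, hit_start, hit_end):
    """Fraction of the feature region covered by the aligned region."""
    overlap = min(feature.end, hit_end) - max(feature.start, hit_start) + 1
    length = feature.end - feature.start + 1
    return max(0, overlap) / length


def transferable_keywords(features, hit_start, hit_end, min_coverage=0.8):
    """Keywords whose feature region is sufficiently covered by the hit.

    min_coverage is an illustrative threshold, not a value from the paper.
    """
    return sorted(
        {f.keyword for f in features
         if overlap_fraction(f, hit_start, hit_end) >= min_coverage}
    )


# A two-domain database protein; the query aligns only to residues 1-150.
features = [Feature("Kinase", 10, 110), Feature("Zinc-finger", 300, 330)]
print(transferable_keywords(features, 1, 150))
```

With keyword-only transfer, both annotations of this hypothetical two-domain protein would be passed to the query. Restricting transfer to the aligned region keeps only the annotation whose feature lies inside the alignment, which is the multidomain problem named in the title.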
Funding Information
- European Union (1FD97-0372, QLRT-2000-01473)