Quantitative sequence-function relationships in proteins based on gene ontology
Open Access
- 8 August 2007
- journal article
- research article
- Published by Springer Nature in BMC Bioinformatics
- Vol. 8 (1) , 294
- https://doi.org/10.1186/1471-2105-8-294
Abstract
The relationship between divergence of amino-acid sequence and divergence of function among homologous proteins is complex. The assumption that homologs share function – the basis of transfer of annotations in databases – must therefore be regarded with caution. Here, we present a quantitative study of sequence and function divergence, based on the Gene Ontology classification of function. We determined the relationship between sequence divergence and function divergence in 6828 protein families from the PFAM database. Within families there is a broad range of sequence similarity from very closely related proteins – for instance, orthologs in different mammals – to very distantly-related proteins at the limit of reliable recognition of homology.Keywords
This publication has 48 references indexed in Scilit:
- Protein Superfamily Evolution and the Last Universal Common Ancestor (LUCA)Journal of Molecular Evolution, 2006
- Blast2GO: a universal tool for annotation, visualization and analysis in functional genomics researchBioinformatics, 2005
- MUSCLE: multiple sequence alignment with high accuracy and high throughputNucleic Acids Research, 2004
- The Pfam protein families databaseNucleic Acids Research, 2004
- Quality control in databanks for molecular biologyBioEssays, 2000
- Assessing annotation transfer for genomics: quantifying the relations between protein sequence, structure and function through traditional and probabilistic scoresJournal of Molecular Biology, 2000
- Gapped BLAST and PSI-BLAST: a new generation of protein database search programsNucleic Acids Research, 1997
- Maximum Discrimination Hidden Markov Models of Sequence ConsensusJournal of Computational Biology, 1995
- EF‐hand motifs in inositol phospholipid‐specific phospholipase CFEBS Letters, 1990
- A general method applicable to the search for similarities in the amino acid sequence of two proteinsJournal of Molecular Biology, 1970