Sequence comparison by sequence harmony identifies subtype-specific functional sites
Open Access
- 27 November 2006
- journal article
- research article
- Published by Oxford University Press (OUP) in Nucleic Acids Research
- Vol. 34 (22), 6540-6548
- https://doi.org/10.1093/nar/gkl901
Abstract
Multiple sequence alignments are often used to reveal functionally important residues within a protein family. They can be particularly useful for the identification of key residues that determine functional differences between protein subfamilies. We present a new entropy-based method, Sequence Harmony (SH), that accurately detects subfamily-specific positions from a multiple sequence alignment. The SH algorithm implements a novel formula that scores compositional differences between subfamilies, without imposing conservation, in a simple manner and on an intuitive scale. We compare our method with the most important published methods, i.e. AMAS, TreeDet and SDP-pred, using three well-studied protein families: the receptor-binding domain (MH2) of the Smad family of transcription factors, the Ras superfamily of small GTPases and the MIP family of integral membrane transporters. We demonstrate that SH accurately selects known functional sites with higher coverage than the other methods for these test cases. This shows that compositional differences between protein subfamilies provide a sufficient basis for identification of functional sites. In addition, SH selects a number of sites of unknown function that could be interesting candidates for further experimental investigation.
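The abstract does not give the SH formula itself, so the following is only a minimal sketch of what an entropy-based score of compositional difference between two subfamilies could look like. It assumes a score of one minus the Jensen-Shannon divergence (base 2) between the residue frequencies of the two groups at one alignment column, which yields 0 for completely distinct compositions and 1 for identical ones. The function names (`column_frequencies`, `entropy`, `harmony_like_score`) and the gap handling are hypothetical choices for illustration and are not taken from the paper.

```python
from collections import Counter
from math import log2


def column_frequencies(column):
    """Relative residue frequencies in one alignment column; '-' gaps are ignored."""
    residues = [r for r in column if r != "-"]
    if not residues:
        raise ValueError("column contains only gaps")
    total = len(residues)
    return {r: n / total for r, n in Counter(residues).items()}


def entropy(freqs):
    """Shannon entropy in bits of a frequency dictionary."""
    return -sum(p * log2(p) for p in freqs.values() if p > 0)


def harmony_like_score(col_a, col_b):
    """Assumed score: 1 - Jensen-Shannon divergence between two subfamilies.

    col_a and col_b hold the residues of subfamily A and subfamily B
    at the same alignment position.
    """
    pa = column_frequencies(col_a)
    pb = column_frequencies(col_b)
    # Equal-weight mixture of the two subfamily distributions.
    mix = {r: 0.5 * (pa.get(r, 0.0) + pb.get(r, 0.0)) for r in set(pa) | set(pb)}
    js_divergence = entropy(mix) - 0.5 * (entropy(pa) + entropy(pb))
    return 1.0 - js_divergence


# Neither subfamily is conserved here, yet the compositions do not overlap,
# so the position scores as fully subfamily-specific.
distinct = harmony_like_score("AAGG", "TTCC")
# Same composition in both subfamilies, so the position is not specific.
identical = harmony_like_score("AAGG", "AGAG")
```

In this sketch a low score marks a candidate subfamily-specific position. Because the score depends only on how the two compositions differ, a position can be flagged even when neither subfamily is conserved, which is the property the abstract describes as scoring "without imposing conservation".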