Use of a database of structural alignments and phylogenetic trees in investigating the relationship between sequence and structural variability among homologous proteins
Open Access
- 1 April 2001
- journal article
- research article
- Published by Oxford University Press (OUP) in Protein Engineering, Design and Selection
- Vol. 14 (4), 219-226
- https://doi.org/10.1093/protein/14.4.219
Abstract
The database PALI (Phylogeny and ALIgnment of homologous protein structures) consists of families of protein domains of known three-dimensional (3D) structure. In a PALI family, every member has been structurally aligned with every other member (pairwise), and a simultaneous superposition of all the members (multiple) has also been performed. The database also contains 3D structure-based and structure-dependent sequence similarity-based phylogenetic dendrograms for all the families. The PALI release used in the present analysis comprises 225 families derived largely from the HOMSTRAD and SCOP databases. The quality of the multiple rigid-body structural alignments in PALI was compared with that obtained from COMPARER, which encodes a procedure based on properties and relationships. The alignments from the two procedures agreed very well, with variations seen only in cases of low sequence similarity, often in the loop regions. A validation of Direct Pairwise Alignment (DPA) between two proteins is provided by comparing it with the Pairwise alignment extracted from the Multiple Alignment of all the members in the family (PMA). In general, DPA and PMA are found to differ only rarely. The ready availability of pairwise alignments allows the analysis of variations in structural distances as a function of sequence similarities and number of topologically equivalent Cα atoms. The structural distance metric used in the analysis combines root mean square deviation (r.m.s.d.) and number of equivalences, and is shown to vary similarly to r.m.s.d. The correlation between sequence similarity and structural similarity is poor in pairs with low sequence similarities. A comparison of sequence and 3D structure-based phylogenies for all the families suggests that only a few families have a radical difference in the two kinds of dendrograms.
The difference could occur when the sequence similarity among the homologues is low or when the structures are subjected to evolutionary pressure for the retention of function. The PALI database is expected to be useful in furthering our understanding of the relationship between sequences and structures of homologous proteins and their evolution.
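The two quantities the abstract relies on, r.m.s.d. over topologically equivalent Cα atoms and the agreement between a direct pairwise alignment (DPA) and one extracted from a multiple alignment (PMA), can be illustrated with a minimal Python sketch. This is not PALI's implementation: the function names, the toy coordinates, and the use of a Jaccard-style overlap to score DPA/PMA agreement are all assumptions made for illustration, and the coordinates are assumed to be already superposed. The abstract does not give the formula of the combined structural distance metric, so it is not reproduced here.

```python
import math


def rmsd(coords_a, coords_b):
    """R.m.s.d. over paired C-alpha coordinates (assumed already superposed)."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    total = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(total / len(coords_a))


def alignment_agreement(dpa_pairs, pma_pairs):
    """Fraction of residue equivalences shared by two pairwise alignments.

    Each alignment is a collection of (residue_in_A, residue_in_B) pairs.
    The Jaccard overlap used here is an illustrative choice, not the
    paper's validation measure.
    """
    dpa, pma = set(dpa_pairs), set(pma_pairs)
    union = dpa | pma
    return len(dpa & pma) / len(union) if union else 1.0


# Hypothetical toy data: three equivalent C-alpha atoms, offset by 1 A in y.
ca_a = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
ca_b = [(0.0, 1.0, 0.0), (3.8, 1.0, 0.0), (7.6, 1.0, 0.0)]
print(rmsd(ca_a, ca_b))  # 1.0

# Hypothetical alignments that disagree on one equivalence.
dpa = [(1, 1), (2, 2), (3, 4)]
pma = [(1, 1), (2, 2), (3, 3)]
print(alignment_agreement(dpa, pma))  # 0.5
```

An agreement close to 1.0 would correspond to the abstract's observation that DPA and PMA rarely differ.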