Principal eigenvector of contact matrices and hydrophobicity profiles in proteins
- 2 November 2004
- journal article
- research article
- Published by Wiley in Proteins-Structure Function and Bioinformatics
- Vol. 58 (1) , 22-30
- https://doi.org/10.1002/prot.20240
Abstract
With the aim of studying the relationship between protein sequences and their native structures, we adopted vectorial representations for both sequence and structure. The structural representation was based on the principal eigenvector of the fold's contact matrix (PE). As has been recently shown, the latter encodes sufficient information for reconstructing the whole contact matrix. The sequence was represented through a hydrophobicity profile (HP), using a generalized hydrophobicity scale that we obtained from the principal eigenvector of a residue–residue interaction matrix and that we call the interactivity scale. Using this novel scale, we defined the optimal HP of a protein fold and, by means of stability arguments, predicted it to be strongly correlated with the PE of the fold's contact matrix. This prediction was confirmed through an evolutionary analysis, which showed that the PE correlates with the HP of each individual sequence adopting the same fold and, even more strongly, with the average HP of this set of sequences. Thus, protein sequences evolve in such a way that their average HP is close to the optimal one, implying that neutral evolution can be viewed as a kind of motion in sequence space around the optimal HP. Our results indicate that the correlation coefficient between N‐dimensional vectors constitutes a natural metric in the vectorial space in which we represent both protein sequences and protein structures, which we call vectorial protein space. In this way, we define a unified framework for sequence‐to‐sequence, sequence‐to‐structure and structure‐to‐structure alignments. We show that the interactivity scale is nearly optimal both for the comparison of sequences to sequences and sequences to structures. Proteins 2005.
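The two vectors compared in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the distance cutoff, the minimum sequence separation, the toy star-shaped contact matrix, and the values in `toy_profile` are all assumptions made for illustration, and the function names are hypothetical. The interactivity scale itself is not reproduced here, so a made-up per-residue profile stands in for the HP.

```python
import numpy as np


def contact_matrix(coords, cutoff=8.0, min_separation=3):
    """Binary contact matrix from an (N, 3) array of residue coordinates.

    cutoff (in the units of coords) and min_separation are illustrative
    assumptions, not values taken from the abstract.
    """
    coords = np.asarray(coords, dtype=float)
    # Pairwise Euclidean distances between all residues.
    dist = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
    idx = np.arange(len(coords))
    # Ignore pairs that are close along the chain.
    far_in_sequence = np.abs(idx[:, None] - idx[None, :]) >= min_separation
    return ((dist < cutoff) & far_in_sequence).astype(float)


def principal_eigenvector(contacts):
    """Eigenvector of a symmetric matrix for its largest eigenvalue (the PE)."""
    # eigh returns eigenvalues in ascending order, so the last column is the PE.
    _, eigenvectors = np.linalg.eigh(np.asarray(contacts, dtype=float))
    pe = eigenvectors[:, -1]
    # The overall sign is arbitrary; choose the orientation with a positive sum.
    return -pe if pe.sum() < 0 else pe


def correlation(x, y):
    """Pearson correlation coefficient between two equal-length vectors."""
    return float(np.corrcoef(x, y)[0, 1])


# Toy example: residue 0 contacts residues 1, 2 and 3 (a star-shaped contact map).
star = np.array(
    [
        [0, 1, 1, 1],
        [1, 0, 0, 0],
        [1, 0, 0, 0],
        [1, 0, 0, 0],
    ],
    dtype=float,
)
pe = principal_eigenvector(star)

# Hypothetical per-residue hydrophobicity values, invented for this example.
toy_profile = np.array([2.0, 0.5, 0.4, 0.6])
r = correlation(pe, toy_profile)
print("PE:", pe)
print("correlation between PE and toy profile:", r)
```

In this toy case the most connected residue receives the largest PE component, so a profile that assigns it the highest hydrophobicity correlates positively with the PE, which is the kind of relationship the abstract reports for real folds.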