The principal eigenvector of contact matrices and hydrophobicity profiles in proteins

Abstract
To study the relationship between protein sequences and their native structures, we adopt vectorial representations for both sequence and structure. The structural representation is based on the Principal Eigenvector of the fold's contact matrix (PE). As recently shown, the PE encodes sufficient information to reconstruct the whole contact matrix. The sequence is represented through a Hydrophobicity Profile (HP), using a generalized hydrophobicity scale that we obtain from the principal eigenvector of a residue-residue interaction matrix and call the interactivity scale. Using this novel scale, we define the optimal HP of a protein fold and predict, by means of stability arguments, that it is strongly correlated with the PE of the fold's contact matrix. This prediction is confirmed through an evolutionary analysis, which shows that the PE correlates with the HP of each individual sequence adopting the same fold and, even more strongly, with the average HP of this set of sequences. Thus, protein sequences evolve in such a way that their average HP is close to the optimal one, implying that neutral evolution can be viewed as a kind of motion in sequence space around the optimal HP. Our results indicate that the correlation coefficient between N-dimensional vectors constitutes a natural metric in the vectorial space in which we represent both protein sequences and protein structures, which we call Vectorial Protein Space. In this way, we define a unified framework for sequence-to-sequence, sequence-to-structure, and structure-to-structure alignments. We show that the interactivity scale is nearly optimal for comparing sequences both with other sequences and with structures.
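The two vectors compared in the abstract can be illustrated with a minimal sketch. It computes the principal eigenvector of a small contact matrix by power iteration and then its correlation coefficient with a hydrophobicity profile. The six-residue contact list and the profile values are invented for illustration and are not the interactivity scale. The function names are hypothetical, and the shift by the identity matrix is a numerical convenience assumed here, not a step taken from the paper.

```python
import math


def contact_matrix(n, contacts):
    """Build a symmetric 0/1 contact matrix from a list of residue pairs."""
    matrix = [[0.0] * n for _ in range(n)]
    for i, j in contacts:
        matrix[i][j] = 1.0
        matrix[j][i] = 1.0
    return matrix


def principal_eigenvector(matrix, max_iter=10_000, tol=1e-12):
    """Power iteration for the eigenvector of the largest eigenvalue."""
    n = len(matrix)
    v = [1.0 / math.sqrt(n)] * n
    for _ in range(max_iter):
        # Multiply by (A + I): it has the same eigenvectors as A, and the
        # shift avoids the oscillation of plain power iteration on
        # bipartite contact maps.
        w = [v[i] + sum(matrix[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        w = [x / norm for x in w]
        if max(abs(a - b) for a, b in zip(v, w)) < tol:
            return w
        v = w
    return v


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length vectors."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0.0 or var_y == 0.0:
        raise ValueError("correlation is undefined for a constant vector")
    return cov / math.sqrt(var_x * var_y)


# Toy fold: six residues with made-up contacts (not a real protein).
toy_contacts = [(0, 2), (0, 3), (1, 3), (1, 4), (2, 4), (2, 5), (0, 5)]
toy_matrix = contact_matrix(6, toy_contacts)
pe = principal_eigenvector(toy_matrix)

# Made-up hydrophobicity profile, one value per residue.
toy_profile = [0.9, 0.2, 1.0, 0.5, 0.4, 0.6]
r = pearson(pe, toy_profile)
print("PE:", [round(x, 3) for x in pe])
print("correlation with toy HP:", round(r, 3))
```

Because the contact matrix is non-negative and the toy contact graph is connected, all components of the PE have the same sign, so residues with many well-connected contacts receive the largest components. This is what makes a direct comparison with a hydrophobicity profile meaningful.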