Analysis of amino acid indices and mutation matrices for sequence comparison and structure prediction of proteins
Open Access
- 1 January 1996
- journal article
- Published by Oxford University Press (OUP) in Protein Engineering, Design and Selection
- Vol. 9 (1) , 27-36
- https://doi.org/10.1093/protein/9.1.27
Abstract
An amino acid index is a set of 20 numerical values representing any of the different physicochemical and biochemical properties of amino acids. As a follow-up to the previous study, we have increased the size of the database, which currently contains 402 published indices, and re-performed the single-linkage cluster analysis. The results basically confirmed the previous findings. Another important feature of amino acids that can be represented numerically is the similarity between them. Thus, a similarity matrix, also called a mutation matrix, is a set of 20×20 numerical values used for protein sequence alignments and similarity searches. We have collected 42 published matrices, performed hierarchical cluster analyses and identified several clusters corresponding to the nature of the data set and the method used for constructing the mutation matrix. Further, we have tried to reproduce each mutation matrix by the combination of amino acid indices in order to understand which properties of amino acids are reflected most. There was a relationship between the PAM units of Dayhoff's mutation matrix and the volume and hydrophobicity of amino acids. The database of 402 amino acid indices and 42 amino acid mutation matrices is made publicly available on the Internet.
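To make the idea of single-linkage clustering of amino acid indices concrete, here is a minimal Python sketch. It is not the authors' procedure: the distance 1 - |r| (with r the Pearson correlation between two indices) is an assumption chosen for illustration, and the three indices are not taken from the paper's database. The hydropathy values are the Kyte-Doolittle scale, the molecular weights are rounded approximations, and `neg_hydropathy` is a synthetic index added only so that one pair is perfectly correlated. All function and variable names are hypothetical.

```python
from itertools import combinations
from math import sqrt

# One-letter codes fixing the order of the 20 values in every index.
AMINO_ACIDS = "ARNDCQEGHILKMFPSTWYV"

# Kyte-Doolittle hydropathy values, in the order of AMINO_ACIDS.
HYDROPATHY = [1.8, -4.5, -3.5, -3.5, 2.5, -3.5, -3.5, -0.4, -3.2, 4.5,
              3.8, -3.9, 1.9, 2.8, -1.6, -0.8, -0.7, -0.9, -1.3, 4.2]

# Approximate molecular weights of the free amino acids (Da, rounded).
MOL_WEIGHT = [89, 174, 132, 133, 121, 146, 147, 75, 155, 131,
              131, 146, 149, 165, 115, 105, 119, 204, 181, 117]

INDICES = {
    "hydropathy": HYDROPATHY,
    "mol_weight": MOL_WEIGHT,
    # Synthetic index (illustration only): negated hydropathy, so |r| = 1.
    "neg_hydropathy": [-v for v in HYDROPATHY],
}


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def distance(x, y):
    """Assumed distance between two indices: 1 - |r|."""
    return 1.0 - abs(pearson(x, y))


def single_linkage(indices):
    """Agglomerative clustering; cluster distance is the minimum pairwise distance."""
    clusters = [frozenset([name]) for name in indices]
    merges = []
    while len(clusters) > 1:
        best = None
        for a, b in combinations(clusters, 2):
            d = min(distance(indices[i], indices[j]) for i in a for j in b)
            if best is None or d < best[0]:
                best = (d, a, b)
        d, a, b = best
        # Replace the two closest clusters with their union.
        clusters = [c for c in clusters if c not in (a, b)] + [a | b]
        merges.append((sorted(a), sorted(b), d))
    return merges


merges = single_linkage(INDICES)
for left, right, d in merges:
    print(f"merge {left} + {right} at distance {d:.3f}")
```

With these three indices the first merge joins `hydropathy` and `neg_hydropathy` at a distance of essentially zero, and `mol_weight` joins last. The same loop applies unchanged to a larger dictionary of indices, although its brute-force search would be slow for several hundred entries.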