Delaunay Tessellation of Proteins: Four Body Nearest-Neighbor Propensities of Amino Acid Residues

Abstract
Delaunay tessellation is applied for the first time to the analysis of protein structure. By representing each amino acid residue in a protein chain by its Cα atom, the protein is described as a set of points in three-dimensional space. Delaunay tessellation of a protein structure generates an aggregate of space-filling irregular tetrahedra, or Delaunay simplices. The vertices of each simplex objectively define four nearest-neighbor Cα atoms, i.e., four nearest-neighbor residues. A simplex classification scheme is introduced in which simplices are divided into five classes based on the relative positions of the vertex residues in the protein primary sequence. Statistical analysis of the residue composition of Delaunay simplices reveals nonrandom preferences for certain quadruplets of amino acids to cluster together. These nonrandom preferences may be used to develop a four-body potential for evaluating sequence–structure compatibility for the purpose of inverted structure prediction.
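
The procedure summarized above can be illustrated with a minimal Python sketch. It is not the authors' implementation, and several details are assumptions made for illustration: the tessellation is delegated to SciPy's `scipy.spatial.Delaunay`; the five classes are assumed to correspond to the five ways four vertices can be grouped into runs of sequence-consecutive residues (4, 3+1, 2+2, 2+1+1, 1+1+1+1); and the helper names `tessellate`, `simplex_class`, and `quadruplet_counts` are hypothetical. Vertex indices are assumed to be sequence positions in a single chain without breaks.

```python
from collections import Counter


def tessellate(ca_coords):
    """Return Delaunay simplices (rows of 4 point indices) for an (N, 3)
    array of C-alpha coordinates. Requires SciPy (imported lazily)."""
    from scipy.spatial import Delaunay
    return Delaunay(ca_coords).simplices


def simplex_class(indices):
    """Classify a simplex by the sizes of its runs of sequence-consecutive
    vertex residues (assumed classification, see the text above)."""
    idx = sorted(indices)
    runs = [1]
    for a, b in zip(idx, idx[1:]):
        if b - a == 1:
            runs[-1] += 1      # b directly follows a: extend the current run
        else:
            runs.append(1)     # gap in sequence: start a new run
    return tuple(sorted(runs, reverse=True))


def quadruplet_counts(simplices, sequence):
    """Count order-independent residue quadruplets over all simplices.
    `sequence` is a string of one-letter residue codes."""
    return Counter("".join(sorted(sequence[i] for i in s)) for s in simplices)
```

For example, `simplex_class([10, 11, 12, 13])` gives `(4,)` (four consecutive residues), while `simplex_class([1, 4, 7, 10])` gives `(1, 1, 1, 1)` (no two vertices adjacent in sequence). Comparing the observed counts from `quadruplet_counts` with the frequencies expected from a random distribution of residues would be the starting point for the statistical analysis described above.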