Protein–nucleic acid recognition: Statistical analysis of atomic interactions and influence of DNA structure

Abstract
We analyzed the structural features of 11,038 direct atomic contacts (electrostatic interactions, H‐bonds, hydrophobic contacts, and other van der Waals interactions) extracted from 139 protein–DNA and 49 protein–RNA nonhomologous complexes in the Protein Data Bank (PDB). Overall, H‐bonds are the most frequent interactions (∼50%), followed by van der Waals, hydrophobic, and electrostatic interactions. From the protein viewpoint, hydrophilic amino acids are over‐represented in the interaction databases: positively charged amino acids mainly contact nucleic acid phosphate groups but can also interact with base edges. From the nucleotide viewpoint, DNA and RNA behave differently: most protein–DNA interactions involve phosphate atoms, whereas protein–RNA interactions more frequently involve base edge and ribose atoms. The increased participation of DNA phosphate involves H‐bonds rather than salt bridges. A statistical analysis was performed to identify the amino acid–nucleotide pairs whose occurrence differs most from chance; these pairs were then analyzed individually. Finally, we studied the conformation of DNA in the interaction sites. Despite the prevalence of B‐DNA in the database, our results suggest that A‐DNA is favored in the interaction sites. Proteins 2005.
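
The abstract does not state which statistic was used to rank amino acid–nucleotide pairs against chance. As a minimal sketch, assuming that "chance" means independence between amino acid and nucleotide frequencies within the contact set, one could compare the observed count of each pair with the count expected from the marginal totals. The function name `pair_enrichment`, the input layout, and the use of a Pearson residual are illustrative assumptions, not the authors' procedure.

```python
from collections import Counter
from math import sqrt


def pair_enrichment(contacts):
    """Compare observed pair counts with counts expected under independence.

    contacts: list of (amino_acid, nucleotide) tuples, one per observed contact.
    Returns {(amino_acid, nucleotide): (observed, expected, residual)}.
    Hypothetical helper; the independence null model is an assumption.
    """
    n = len(contacts)
    pair_counts = Counter(contacts)
    aa_counts = Counter(aa for aa, _ in contacts)
    nt_counts = Counter(nt for _, nt in contacts)

    results = {}
    for aa, aa_total in aa_counts.items():
        for nt, nt_total in nt_counts.items():
            observed = pair_counts.get((aa, nt), 0)
            # Expected count if amino acid and nucleotide were chosen independently.
            expected = aa_total * nt_total / n
            # Pearson residual: positive means over-represented, negative under-represented.
            residual = (observed - expected) / sqrt(expected)
            results[(aa, nt)] = (observed, expected, residual)
    return results


def rank_pairs(contacts):
    """Return pairs sorted by absolute residual, most different from chance first."""
    results = pair_enrichment(contacts)
    return sorted(results.items(), key=lambda item: abs(item[1][2]), reverse=True)
```

Calling `rank_pairs` on a list of contacts would put the pairs that deviate most from the independence expectation at the top, which is the kind of shortlist that could then be examined pair by pair. With sparse data, the residuals of rare pairs are unreliable, so a real analysis would likely need a minimum-count filter or an exact test.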