Location of structural domains in proteins

Abstract
Surface area measurements based on atomic positions are used to give a quantitative definition of structural domains in proteins. Segments of the polypeptide chain that have a minimum of interactions with the rest of the protein structure are identified on interface area scans, in which the area B of the interface between an N-terminal segment of i residues and the complementary C-terminal segment is plotted as a function of i. Domain boundaries appear as minima of B in the scans. The procedure may be iterated to build a hierarchy of subdomains. It detects only continuous domains, made of a single stretch of polypeptide chain, but may be extended to detect such domains in the presence of discontinuous ones. Domains defined from interface area scans fit very well with the globular structural regions identified by inspection of protein models. They do not, in general, correspond to the repeated structural units observed in some proteins by superposition studies. In hemoglobin and hen lysozyme, the domains do not correspond to the coding sequences separated by introns in the genes.
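
The reading of an interface area scan can be illustrated with a minimal sketch. It assumes that the values B(i) have already been computed and stored in a list whose index k holds B for an N-terminal segment of k residues. The function name find_scan_minima, the window parameter, and the toy values are hypothetical, introduced only for illustration; the criterion used here (a strict minimum within a fixed window) is one simple choice and not necessarily the one applied to real scans.

```python
from typing import List, Sequence


def find_scan_minima(scan: Sequence[float], window: int = 2) -> List[int]:
    """Return positions where the scan is a strict minimum within +/- window.

    scan[k] is assumed to hold the interface area B for an N-terminal
    segment of k residues. Positions closer than `window` to either end
    are skipped, since B falls to zero there for trivial reasons.
    """
    minima = []
    for i in range(window, len(scan) - window):
        neighbourhood = list(scan[i - window : i + window + 1])
        lowest = min(neighbourhood)
        # Keep i only if it holds the unique lowest value in its window.
        if scan[i] == lowest and neighbourhood.count(lowest) == 1:
            minima.append(i)
    return minima


# Made-up scan values (not measured data): two humps separated by a dip.
toy_scan = [0.0, 300.0, 600.0, 900.0, 700.0, 250.0,
            650.0, 950.0, 800.0, 400.0, 0.0]

boundaries = find_scan_minima(toy_scan, window=2)
print(boundaries)  # candidate domain boundary positions
```

Iterating the procedure to obtain subdomains would amount to recomputing a scan for each segment on either side of a boundary and applying the same search again.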