Identification of structural motifs from protein coordinate data: Secondary structure and first‐level supersecondary structure
- 1 January 1988
- journal article
- research article
- Published by Wiley in Proteins-Structure Function and Bioinformatics
- Vol. 3 (2), 71-84
- https://doi.org/10.1002/prot.340030202
Abstract
A computer program is described that produces a description of the secondary structure and supersecondary structure of a polypeptide chain using the list of alpha carbon coordinates as input. Restricting the term “secondary structure” to the conformation of contiguous segments of the chain, the program determines the initial and final residues in helices, extended strands, sharp turns, and omega loops. This is accomplished through the use of difference distance matrices. The distances in idealized models of the segments are compared with the actual structure, and the differences are evaluated for agreement within preset limits. The program assigns 90–95% of the residues in most proteins to at least one type of secondary element. In a second step the now-defined helices and strands are idealized as straight line segments, and the axial directions and locations are compiled from the input Cα coordinate list. These data are used to check for moderate curvature in strands and helices, and the secondary structure list is corrected where necessary. The geometric relations between these line segments are then calculated and output as the first level of supersecondary structure. A maximum of six parameters are required for a complete description of the relations between each pair. Frequently a less complete description will suffice, for example just the interaxial separation and angle. Both the secondary structure and one aspect of the supersecondary structure can be displayed in a character matrix analogous to the distance matrix format. This allows a quite accurate two-dimensional display of the three-dimensional structure, and several examples are presented. A procedure for searching for arbitrary substructures in proteins using distance matrices is also described.
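The distance-matrix comparison described above can be illustrated with a minimal sketch. It is not the program from the article: the idealized models and preset limits are not given in this abstract, so the helix geometry used here (2.3 Å Cα radius, 1.5 Å rise, 100° per residue), the 1.0 Å tolerance, and all function names are illustrative assumptions. A model segment is slid along the chain, and a window is accepted when every entry of the difference distance matrix lies within the tolerance.

```python
import math

def distance_matrix(coords):
    """All pairwise distances between C-alpha positions given as (x, y, z)."""
    return [[math.dist(a, b) for b in coords] for a in coords]

def ideal_helix(n, radius=2.3, rise=1.5, turn_deg=100.0):
    """C-alpha trace of an idealized alpha-helix along z (assumed geometry)."""
    step = math.radians(turn_deg)
    return [(radius * math.cos(i * step), radius * math.sin(i * step), rise * i)
            for i in range(n)]

def max_difference(window, model):
    """Largest absolute entry of the difference distance matrix."""
    d_obs = distance_matrix(window)
    d_ref = distance_matrix(model)
    n = len(model)
    return max(abs(d_obs[i][j] - d_ref[i][j])
               for i in range(n) for j in range(i + 1, n))

def find_matches(coords, model, tol=1.0):
    """Start indices of windows whose distances all agree with the model
    within tol (an assumed limit, in angstroms)."""
    w = len(model)
    return [s for s in range(len(coords) - w + 1)
            if max_difference(coords[s:s + w], model) <= tol]

# Toy input: an ideal helix and a crude straight-line stand-in for a strand.
helix = ideal_helix(10)
strand = [(0.0, 0.0, 3.8 * i) for i in range(10)]
model = ideal_helix(6)
print(find_matches(helix, model))   # [0, 1, 2, 3, 4]
print(find_matches(strand, model))  # []
```

Because distance matrices are invariant to rotation and translation, no superposition step is needed before comparing a window with the model. The same sliding comparison works for any model segment, which is what makes it usable for arbitrary substructure searches.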
A search for the DNA binding helix-turn-helix motif in the Protein Data Bank serves as an example. A further abstraction of the above data can be made in the form of a metamatrix where each diagonal element represents an entire secondary segment rather than a single atom, and the off-diagonal elements contain all the parameters describing their interrelations. Such matrices can be used in a straightforward search for higher levels of supersecondary structure or used in toto as a representation of the entire tertiary structure of the polypeptide chain.
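The line-segment idealization and the metamatrix can be sketched in the same spirit. This is a hedged illustration, not the article's procedure: the abstract does not say how axes are fitted or which six parameters are used, so the sketch assumes an axis running between the centroids of the first and last four Cα atoms, treats axes as infinite lines, and stores only the two parameters the abstract names (interaxial separation and angle). All function names and the dictionary layout are hypothetical.

```python
import math

def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def norm(a):
    return math.sqrt(dot(a, a))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def centroid(points):
    n = len(points)
    return tuple(sum(p[k] for p in points) / n for k in range(3))

def segment_axis(coords, k=4):
    """(start, end) of an approximate axis: centroids of the first and last
    k C-alphas. Assumes the segment has more than k residues."""
    return centroid(coords[:k]), centroid(coords[-k:])

def interaxial_angle(ax1, ax2):
    """Angle in degrees between the two axis directions."""
    d1, d2 = sub(ax1[1], ax1[0]), sub(ax2[1], ax2[0])
    c = dot(d1, d2) / (norm(d1) * norm(d2))
    return math.degrees(math.acos(max(-1.0, min(1.0, c))))

def interaxial_separation(ax1, ax2):
    """Closest distance between the axes, treated as infinite lines."""
    d1, d2 = sub(ax1[1], ax1[0]), sub(ax2[1], ax2[0])
    offset = sub(ax2[0], ax1[0])
    n = cross(d1, d2)
    if norm(n) < 1e-9:  # parallel axes: point-to-line distance
        return norm(cross(offset, d1)) / norm(d1)
    return abs(dot(offset, n)) / norm(n)

def metamatrix(segments):
    """segments: list of (kind, coords). Diagonal entries describe a segment;
    off-diagonal entries hold the pairwise separation and angle."""
    axes = [segment_axis(coords) for _, coords in segments]
    n = len(segments)
    m = [[None] * n for _ in range(n)]
    for i, (kind, coords) in enumerate(segments):
        m[i][i] = {"kind": kind, "length": len(coords)}
        for j in range(i + 1, n):
            m[i][j] = m[j][i] = {
                "separation": interaxial_separation(axes[i], axes[j]),
                "angle": interaxial_angle(axes[i], axes[j]),
            }
    return m

# Toy input: two straight, perpendicular segments 5 angstroms apart.
segments = [("helix", [(0.0, 0.0, 1.5 * i) for i in range(8)]),
            ("strand", [(3.3 * i, 5.0, 0.0) for i in range(8)])]
for row in metamatrix(segments):
    print(row)
```

A matrix with one row per segment is far smaller than the per-atom distance matrix, so scanning it for a pattern of segment types and pairwise geometry is a simple way to look for higher-level arrangements.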