Hidden Markov model approach for identifying the modular framework of the protein backbone
Open Access
- 1 December 1999
- journal article
- research article
- Published by Oxford University Press (OUP) in Protein Engineering, Design and Selection
- Vol. 12 (12) , 1063-1073
- https://doi.org/10.1093/protein/12.12.1063
Abstract
The hidden Markov model (HMM) was used to identify recurrent short 3D structural building blocks (SBBs) describing protein backbones, independently of any a priori knowledge. Polypeptide chains are decomposed into a series of short segments defined by their inter-α-carbon distances. The model takes into account the sequential order of the observed segments and assumes that each one corresponds to one of several possible SBBs. Fitting the model to a database of non-redundant proteins allowed us to decode proteins in terms of 12 distinct SBBs with different roles in protein structure. Some SBBs correspond to classical regular secondary structures. Others correspond to a significant subdivision of their bounding regions, previously considered to be a single pattern. The major contribution of the HMM is that it implicitly takes into account the sequential connections between SBBs and thus describes the most probable pathways by which the blocks are connected to form the framework of protein structures. The SBB code was validated by extracting SBB series repeated in the recoded proteins and examining their structural similarities. Preliminary results on the sequence specificity of SBBs suggest promising perspectives for the prediction of SBBs, or series of SBBs, from protein sequences.
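The two steps named in the abstract, describing short backbone segments by inter-α-carbon distances and decoding a chain into its most probable series of hidden states, can be illustrated with a minimal sketch. It is not the authors' implementation. The abstract does not give the segment length, the exact descriptors, the emission model, or the decoding algorithm, so the four-residue window, the use of all non-adjacent Cα pairs, and standard Viterbi decoding over precomputed log-likelihoods are assumptions made here for illustration. The function names `segment_descriptors` and `viterbi` are hypothetical.

```python
import math
from itertools import combinations


def segment_descriptors(ca_coords, length=4):
    """Describe each overlapping window of `length` residues by the distances
    between its non-adjacent alpha-carbon pairs.

    The window length (4) and the choice of pairs are assumptions; the
    abstract only says segments are defined by inter-alpha-carbon distances.
    """
    descriptors = []
    for start in range(len(ca_coords) - length + 1):
        window = ca_coords[start:start + length]
        dists = [
            math.dist(window[i], window[j])
            for i, j in combinations(range(length), 2)
            if j - i > 1  # skip consecutive residues (near-constant distance)
        ]
        descriptors.append(dists)
    return descriptors


def viterbi(log_emit, log_start, log_trans):
    """Return the most probable hidden-state path (standard Viterbi).

    log_emit[t][k]  : log-likelihood of segment t under state k
    log_start[k]    : log-probability of starting in state k
    log_trans[j][k] : log-probability of moving from state j to state k
    """
    n_states = len(log_start)
    score = [log_start[k] + log_emit[0][k] for k in range(n_states)]
    back = []  # back[t-1][k] = best predecessor of state k at position t
    for t in range(1, len(log_emit)):
        new_score, pointers = [], []
        for k in range(n_states):
            best_prev = max(range(n_states),
                            key=lambda j: score[j] + log_trans[j][k])
            pointers.append(best_prev)
            new_score.append(
                score[best_prev] + log_trans[best_prev][k] + log_emit[t][k]
            )
        score = new_score
        back.append(pointers)
    # Trace back from the best final state.
    state = max(range(n_states), key=lambda k: score[k])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path


# Toy usage: five collinear alpha-carbons spaced 3.8 A apart give two windows.
toy_chain = [(3.8 * i, 0.0, 0.0) for i in range(5)]
toy_descriptors = segment_descriptors(toy_chain)

# Two hypothetical states with "sticky" transitions and made-up likelihoods.
log = math.log
toy_path = viterbi(
    log_emit=[[log(0.9), log(0.1)]] * 2 + [[log(0.1), log(0.9)]] * 2,
    log_start=[log(0.5), log(0.5)],
    log_trans=[[log(0.9), log(0.1)], [log(0.1), log(0.9)]],
)
```

In a full model each state would stand for one SBB, the emission log-likelihoods would come from a distribution fitted to the segment descriptors, and the transition matrix would carry the block-to-block pathways the abstract refers to. Because the transitions enter the score at every position, the decoded label of a segment depends on its neighbours rather than on its own descriptors alone.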