Bayesian probabilistic approach for predicting backbone structures in terms of protein blocks

15 November 2000

journal article
research article
Published by Wiley in Proteins-Structure Function and Bioinformatics

Vol. 41 (3), 271-287
https://doi.org/10.1002/1097-0134(20001115)41:3<271::aid-prot10>3.0.co;2-z

Abstract

By using an unsupervised cluster analyzer, we have identified a local structural alphabet composed of 16 folding patterns of five consecutive Cα (“protein blocks”). The dependence that exists between successive blocks is explicitly taken into account. A Bayesian approach based on the relationship between protein blocks and amino acid propensities is used for prediction and leads to a success rate close to 35%. Grouping the sequence windows associated with certain blocks into “sequence families” improves the prediction accuracy by 6%. This prediction accuracy exceeds 75% when keeping the first four predicted protein blocks at each site of the protein. In addition, two different strategies are proposed: the first one defines the number of protein blocks at each site needed to meet a user-fixed prediction accuracy, and alternatively, the second one defines the different protein sites to be predicted with a user-fixed number of blocks and a chosen accuracy. This last strategy, applied to the ubiquitin-conjugating enzyme (α/β protein), shows that 91% of the sites may be predicted with a prediction accuracy greater than 77% considering only three blocks per site. The prediction strategies proposed improve our knowledge about sequence-structure dependence and should be very useful in ab initio protein modelling. Proteins 2000;41:271–287.
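
The ranking idea in the abstract (score every protein block for a sequence window with Bayes' rule, then keep the best few candidates per site) can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes that positions in the window contribute independently to the likelihood, it uses three hypothetical blocks and a two-letter alphabet with made-up probabilities instead of the 16 blocks and 20 amino acids, and the helper `blocks_needed` uses cumulative posterior mass only as a stand-in for the paper's user-fixed accuracy criterion. All function and variable names are illustrative.

```python
import math


def rank_blocks(window, priors, likelihoods):
    """Rank blocks by posterior P(block | window), highest first.

    window      : string of one-letter amino acid codes
    priors      : {block: P(block)}
    likelihoods : {block: [{aa: P(aa | block) at that window position}, ...]}
    Assumption: window positions are treated as independent given the block.
    """
    log_scores = {}
    for block, prior in priors.items():
        log_p = math.log(prior)
        for pos, aa in enumerate(window):
            log_p += math.log(likelihoods[block][pos][aa])
        log_scores[block] = log_p

    # Normalise in log space (log-sum-exp) to obtain posteriors that sum to 1.
    m = max(log_scores.values())
    total = sum(math.exp(s - m) for s in log_scores.values())
    posteriors = {b: math.exp(s - m) / total for b, s in log_scores.items()}
    return sorted(posteriors.items(), key=lambda kv: kv[1], reverse=True)


def top_k(ranked, k):
    """Keep the k best-ranked blocks for a site."""
    return [block for block, _ in ranked[:k]]


def blocks_needed(ranked, target):
    """Smallest number of top-ranked blocks whose cumulative posterior
    reaches `target` (a proxy criterion, assumed for illustration)."""
    cumulative = 0.0
    for n, (_, p) in enumerate(ranked, start=1):
        cumulative += p
        if cumulative >= target:
            return n
    return len(ranked)


# Toy, made-up parameters: three hypothetical blocks, window of two residues.
priors = {"a": 0.5, "b": 0.3, "c": 0.2}
likelihoods = {
    "a": [{"A": 0.7, "G": 0.3}, {"A": 0.6, "G": 0.4}],
    "b": [{"A": 0.2, "G": 0.8}, {"A": 0.5, "G": 0.5}],
    "c": [{"A": 0.1, "G": 0.9}, {"A": 0.1, "G": 0.9}],
}

ranked = rank_blocks("GG", priors, likelihoods)
best_two = top_k(ranked, 2)
n_for_80 = blocks_needed(ranked, 0.80)
```

Keeping several candidates per site trades specificity for coverage, which is the trade-off behind the jump from about 35% accuracy with one block to more than 75% with four that the abstract reports.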
