A novel approach to predicting protein structural classes in a (20–1)‐D amino acid composition space
- 1 April 1995
- journal article
- research article
- Published by Wiley in Proteins-Structure Function and Bioinformatics
- Vol. 21 (4) , 319-344
- https://doi.org/10.1002/prot.340210406
Abstract
The development of prediction methods based on statistical theory generally consists of two parts: one is focused on the exploration of new algorithms, and the other on the improvement of a training database. The current study is devoted to improving the prediction of protein structural classes from both of the two aspects. To explore a new algorithm, a method has been developed that makes allowance for taking into account the coupling effect among different amino acid components of a protein by a covariance matrix. To improve the training database, the selection of proteins is carried out so that they have (1) as many non-homologous structures as possible, and (2) a good quality of structure. Thus, 129 representative proteins are selected. They are classified into 30 α, 30 β, 30 α + β, 30 α/β, and 9 ζ (irregular) proteins according to a new criterion that better reflects the feature of the structural classes concerned. The average accuracy of prediction by the current method for the 4 × 30 regular proteins is 99.2%, and that for 64 independent testing proteins not included in the training database is 95.3%. To further validate its efficiency, a jackknife analysis has been performed for the current method as well as the previous ones, and the results are also much in favor of the current method. To complete the mathematical basis, a theorem is presented and proved in Appendix A that is instructive for understanding the novel method at a deeper level.
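The abstract does not state the formula used, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes that each protein is reduced to 19 of its 20 amino acid frequencies (the 20 sum to 1, so one is redundant, which is one reading of the "(20–1)-D" space), and that the covariance matrix enters through a squared Mahalanobis distance to each class mean, a standard way to account for coupling among components. The jackknife is sketched as a plain leave-one-out test. All function names, the `ridge` stabilizer, and the synthetic "alpha"/"beta" toy sequences are hypothetical and carry no biological meaning.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def composition(seq):
    """First 19 amino acid frequencies; the 20th is implied since all sum to 1."""
    return [seq.count(a) / len(seq) for a in AMINO_ACIDS[:-1]]

def mean_and_cov(vectors, ridge=1e-8):
    """Sample mean and covariance; `ridge` (an assumption) stabilizes inversion."""
    m, d = len(vectors), len(vectors[0])
    mean = [sum(v[i] for v in vectors) / m for i in range(d)]
    cov = [[sum((v[i] - mean[i]) * (v[j] - mean[j]) for v in vectors) / (m - 1)
            + (ridge if i == j else 0.0) for j in range(d)] for i in range(d)]
    return mean, cov

def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            f = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= f * aug[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - tail) / aug[r][r]
    return x

def mahalanobis_sq(x, mean, cov):
    """Squared Mahalanobis distance (x - mean)^T cov^{-1} (x - mean)."""
    diff = [xi - mi for xi, mi in zip(x, mean)]
    return sum(d * y for d, y in zip(diff, solve(cov, diff)))

def predict(x, models):
    """Assign x to the class whose mean is nearest in Mahalanobis distance."""
    return min(models, key=lambda label: mahalanobis_sq(x, *models[label]))

def jackknife_accuracy(data):
    """Leave-one-out: hold out each protein, refit its class, then predict it."""
    full = {label: mean_and_cov(vs) for label, vs in data.items()}
    hits = total = 0
    for label, vectors in data.items():
        for i, held_out in enumerate(vectors):
            models = dict(full)
            models[label] = mean_and_cov(vectors[:i] + vectors[i + 1:])
            hits += predict(held_out, models) == label
            total += 1
    return hits / total

def toy_sequence(favoured, rng, length=200):
    """Synthetic sequence (not real protein data) enriched in `favoured` residues."""
    weights = [10 if a in favoured else 1 for a in AMINO_ACIDS]
    return "".join(rng.choices(AMINO_ACIDS, weights=weights, k=length))

# Hypothetical training set: 30 synthetic sequences per class.
rng = random.Random(0)
data = {
    "alpha": [composition(toy_sequence("AEL", rng)) for _ in range(30)],
    "beta": [composition(toy_sequence("VIT", rng)) for _ in range(30)],
}
models = {label: mean_and_cov(vs) for label, vs in data.items()}
```

Calling `predict(composition(seq), models)` returns the label of the nearest class, and `jackknife_accuracy(data)` gives the leave-one-out hit rate. Each class needs more training proteins than dimensions (here 30 against 19) for its covariance matrix to be invertible without relying on the ridge term.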