The importance of larger data sets for protein secondary structure prediction with neural networks
Open Access
- 1 April 1996
- journal article
- research article
- Published by Wiley in Protein Science
- Vol. 5 (4) , 768-774
- https://doi.org/10.1002/pro.5560050422
Abstract
A neural network algorithm is applied to secondary structure and structural class prediction for a database of 318 nonhomologous protein chains. Significant improvement in accuracy is obtained as compared with performance on smaller databases. A systematic study of the effects of network topology shows that, for the larger database, better results are obtained with more units in the hidden layer. In a 32-fold cross-validated test, secondary structure prediction accuracy is 67.0%, relative to 62.6% obtained previously, without any evolutionary information on the sequence. Introduction of sequence profiles increases this value to 72.9%, suggesting that the two types of information are essentially independent. Tertiary structural class is predicted with 80.2% accuracy, relative to 73.9% obtained previously. The use of a larger database is facilitated by the introduction of a scaled conjugate gradient algorithm for optimizing the neural network. This algorithm is about 10–20 times as fast as the standard steepest descent algorithm.
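As a rough illustration of the evaluation setup the abstract describes, the sketch below partitions 318 chains into 32 cross-validation folds and computes a per-residue accuracy. It is a minimal sketch under stated assumptions, not the authors' procedure: the abstract does not say how chains were assigned to folds, the function names `make_folds` and `q3_accuracy` are hypothetical, and both the three-state alphabet (H, E, C) and the reading of "accuracy" as the percentage of correctly predicted residues are assumptions.

```python
from typing import List, Sequence


def make_folds(n_chains: int, k: int) -> List[List[int]]:
    """Partition chain indices 0..n_chains-1 into k contiguous, near-equal folds.

    Assumption: folds are built per chain, so no chain is split between
    training and test data. The paper's actual assignment is not given here.
    """
    base, extra = divmod(n_chains, k)
    folds: List[List[int]] = []
    start = 0
    for i in range(k):
        # The first `extra` folds receive one additional chain.
        size = base + (1 if i < extra else 0)
        folds.append(list(range(start, start + size)))
        start += size
    return folds


def q3_accuracy(predicted: Sequence[str], observed: Sequence[str]) -> float:
    """Percentage of residues whose predicted state equals the observed state."""
    if len(predicted) != len(observed) or len(observed) == 0:
        raise ValueError("sequences must be non-empty and of equal length")
    correct = sum(p == o for p, o in zip(predicted, observed))
    return 100.0 * correct / len(observed)


# 318 chains in 32 folds: every fold holds 9 or 10 chains.
folds = make_folds(318, 32)
```

In each of the 32 rounds, one fold would be held out for testing while the network is trained on the remaining chains, and the reported accuracy would be taken over the held-out predictions.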
Funding Information
- National Science Foundation