Training a Sigmoidal Node Is Hard
- 1 July 1999
- journal article
- Published by MIT Press in Neural Computation
- Vol. 11 (5) , 1249-1260
- https://doi.org/10.1162/089976699300016449
Abstract
This article proves that the task of computing near-optimal weights for sigmoidal nodes under the L1 regression norm is NP-Hard. For the special case where the sigmoid is piecewise linear, we prove a slightly stronger result: that computing the optimal weights is NP-Hard. These results parallel that for the one-node pattern recognition problem—that determining the optimal weights for a threshold logic node is also intractable. Our results have important consequences for constructive algorithms that build a regression model one node at a time. It suggests that although such methods are (in principle) capable of producing efficient size representations (Barron, 1993; Jones, 1992), finding such representations may be computationally intractable. These results holds only in the deterministic sense; that is, they do not exclude the possibility that such representations may be found efficiently with high probability. In fact it motivates the use of heuristic or randomized algorithms for this problem.Keywords
This publication has 9 references indexed in Scilit:
- Efficient algorithms for function approximation with piecewise linear sigmoidal networksIEEE Transactions on Neural Networks, 1998
- The computational intractability of training sigmoidal neural networksIEEE Transactions on Information Theory, 1997
- Back-propagation is not EfficientNeural Networks, 1996
- On the complexity of training neural networks with continuous activation functionsIEEE Transactions on Neural Networks, 1995
- Computational limitations on training sigmoid neural networksInformation Processing Letters, 1993
- Universal approximation bounds for superpositions of a sigmoidal functionIEEE Transactions on Information Theory, 1993
- A Simple Lemma on Greedy Approximation in Hilbert Space and Convergence Rates for Projection Pursuit Regression and Neural Network TrainingThe Annals of Statistics, 1992
- Complexity results on learning by neural netsMachine Learning, 1991
- A Proposal for More Powerful Learning AlgorithmsNeural Computation, 1989