Training a Sigmoidal Node Is Hard

1 July 1999

journal article
Published by MIT Press in Neural Computation

Vol. 11 (5) , 1249-1260
https://doi.org/10.1162/089976699300016449

Abstract

This article proves that the task of computing near-optimal weights for sigmoidal nodes under the L₁ regression norm is NP-Hard. For the special case where the sigmoid is piecewise linear, we prove a slightly stronger result: that computing the optimal weights is NP-Hard. These results parallel that for the one-node pattern recognition problem—that determining the optimal weights for a threshold logic node is also intractable. Our results have important consequences for constructive algorithms that build a regression model one node at a time. It suggests that although such methods are (in principle) capable of producing efficient size representations (Barron, 1993; Jones, 1992), finding such representations may be computationally intractable. These results holds only in the deterministic sense; that is, they do not exclude the possibility that such representations may be found efficiently with high probability. In fact it motivates the use of heuristic or randomized algorithms for this problem.

Keywords

This publication has 9 references indexed in Scilit:

Efficient algorithms for function approximation with piecewise linear sigmoidal networks
IEEE Transactions on Neural Networks, 1998
The computational intractability of training sigmoidal neural networks
IEEE Transactions on Information Theory, 1997
Back-propagation is not Efficient
Neural Networks, 1996
On the complexity of training neural networks with continuous activation functions
IEEE Transactions on Neural Networks, 1995
Computational limitations on training sigmoid neural networks
Information Processing Letters, 1993
Universal approximation bounds for superpositions of a sigmoidal function
IEEE Transactions on Information Theory, 1993
A Simple Lemma on Greedy Approximation in Hilbert Space and Convergence Rates for Projection Pursuit Regression and Neural Network Training
The Annals of Statistics, 1992
Complexity results on learning by neural nets
Machine Learning, 1991
A Proposal for More Powerful Learning Algorithms
Neural Computation, 1989