Universal approximation bounds for superpositions of a sigmoidal function
- 1 May 1993
- journal article
- Published by Institute of Electrical and Electronics Engineers (IEEE) in IEEE Transactions on Information Theory
- Vol. 39 (3), 930-945
- https://doi.org/10.1109/18.256500
Abstract
Approximation properties of a class of artificial neural networks are established. It is shown that feedforward networks with one layer of sigmoidal nonlinearities achieve integrated squared error of order O(1/n), where n is the number of nodes. The approximated function is assumed to have a bound on the first moment of the magnitude distribution of the Fourier transform. The nonlinear parameters associated with the sigmoidal nodes, as well as the parameters of linear combination, are adjusted in the approximation. In contrast, it is shown that for series expansions with n terms, in which only the parameters of linear combination are adjusted, the integrated squared approximation error cannot be made smaller than order (1/n)^(2/d) uniformly for functions satisfying the same smoothness assumption, where d is the dimension of the input to the function. For the class of functions examined, the approximation rate and the parsimony of the parameterization of the networks are shown to be advantageous in high-dimensional settings.
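The contrast the abstract draws can be illustrated numerically. The sketch below fits a smooth one-dimensional target with superpositions of n sigmoidal nodes and reports the empirical integrated squared error as n grows. Note the hedge: Barron's theorem adjusts the inner (nonlinear) node parameters as part of the construction, whereas this sketch draws them at random and solves only for the linear combination, so it illustrates the flavor of the n-term superposition rather than the exact O(1/n) construction; the target function, weight scales, and grid are all illustrative choices, not from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Target: a smooth 1-D function (finite Fourier first moment, in the
# spirit of the paper's smoothness assumption).
x = np.linspace(-1.0, 1.0, 400)[:, None]
f = np.sin(3.0 * x[:, 0])

def fit_error(n):
    """Empirical mean squared error of an n-node sigmoidal superposition.

    Inner weights and biases are drawn at random (illustrative only);
    the outer linear-combination coefficients are fit by least squares.
    """
    w = rng.normal(scale=4.0, size=(1, n))    # inner weights (random, not optimized)
    b = rng.uniform(-4.0, 4.0, size=n)        # biases
    phi = 1.0 / (1.0 + np.exp(-(x @ w + b)))  # sigmoidal features, shape (400, n)
    c, *_ = np.linalg.lstsq(phi, f, rcond=None)
    resid = f - phi @ c
    return float(np.mean(resid ** 2))

for n in (4, 16, 64):
    print(n, fit_error(n))
```

The printed errors shrink as n grows; with the inner parameters also optimized (as in the paper's construction), the decay is guaranteed to be of order 1/n, independent of the input dimension d.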