On the Relationship between Generalization Error, Hypothesis Complexity, and Sample Complexity for Radial Basis Functions
- 1 May 1996
- journal article
- Published by MIT Press in Neural Computation
- Vol. 8 (4), 819-842
- https://doi.org/10.1162/neco.1996.8.4.819
Abstract
Feedforward networks together with their training algorithms are a class of regression techniques that can be used to learn to perform some task from a set of examples. The question of generalization of network performance from a finite training set to unseen data is clearly of crucial importance. In this article we first show that the generalization error can be decomposed into two terms: the approximation error, due to the insufficient representational capacity of a finite-sized network, and the estimation error, due to insufficient information about the target function because of the finite number of samples. We then consider the problem of learning functions belonging to certain Sobolev spaces with Gaussian radial basis functions. Using the above-mentioned decomposition we bound the generalization error in terms of the number of basis functions and the number of examples. While the bound that we derive is specific to radial basis functions, a number of observations deriving from it apply to any approximation technique. Our result also sheds light on ways to choose an appropriate network architecture for a particular problem and on the kinds of problems that can be effectively solved with finite resources, i.e., with a finite number of parameters and finite amounts of data.
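
The decomposition the abstract refers to can be written schematically as follows. This is a minimal sketch in generic notation (f_0 for the target function, f_n for the best approximation by an n-unit network, and \hat{f}_{n,l} for the network estimated from l examples); the precise norms, assumptions, and constants are those of the paper, not this note.

```latex
% Schematic decomposition of the generalization error; the notation
% (f_0, f_n, \hat{f}_{n,l}) is illustrative and may differ from the paper's.
\| f_0 - \hat{f}_{n,l} \|
  \;\le\;
  \underbrace{\| f_0 - f_n \|}_{\text{approximation error}}
  \;+\;
  \underbrace{\| f_n - \hat{f}_{n,l} \|}_{\text{estimation error}}
```

Schematically, the approximation term shrinks as the number of basis functions n grows, while the estimation term grows with n for a fixed number of examples l and shrinks as l grows, so the bound makes the tradeoff between network size and sample size explicit.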
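
As a purely illustrative complement (not the paper's experiments), the sketch below fits a Gaussian radial basis function network with fixed, randomly placed centers by linear least squares and reports test error for a few choices of n (basis functions) and l (examples). All function names, the 1-D target, and the parameter values are assumptions made for this example.

```python
# Minimal sketch: Gaussian RBF regression with n fixed centers, trained on l
# noisy samples, illustrating how test error depends on n and l.
import numpy as np

rng = np.random.default_rng(0)

def target(x):
    # Smooth 1-D target standing in for a function from a smoothness class.
    return np.sin(2 * np.pi * x) + 0.5 * np.cos(5 * np.pi * x)

def rbf_features(x, centers, width):
    # Gaussian radial basis features exp(-(x - c)^2 / (2 * width^2)).
    return np.exp(-(x[:, None] - centers[None, :]) ** 2 / (2 * width ** 2))

def fit_and_test(n_centers, n_samples, width=0.1, noise=0.1):
    # l noisy training samples on [0, 1] and n randomly placed centers.
    x_train = rng.uniform(0.0, 1.0, n_samples)
    y_train = target(x_train) + noise * rng.standard_normal(n_samples)
    centers = rng.uniform(0.0, 1.0, n_centers)
    # Output weights by linear least squares (centers and width held fixed).
    Phi = rbf_features(x_train, centers, width)
    w, *_ = np.linalg.lstsq(Phi, y_train, rcond=None)
    # Test error on a dense grid approximates the generalization error.
    x_test = np.linspace(0.0, 1.0, 1000)
    y_pred = rbf_features(x_test, centers, width) @ w
    return np.mean((target(x_test) - y_pred) ** 2)

for n, l in [(5, 50), (20, 50), (80, 50), (20, 500), (80, 500)]:
    err = np.mean([fit_and_test(n, l) for _ in range(20)])
    print(f"n={n:3d} centers, l={l:4d} examples -> test MSE {err:.4f}")
```

With too few centers the approximation error dominates regardless of how many examples are available; with many centers but few examples the estimation error dominates, which is the qualitative behavior the paper's bound captures.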