Pruning from Adaptive Regularization
- 1 November 1994
- journal article
- Published by MIT Press in Neural Computation
- Vol. 6 (6), 1223-1232
- https://doi.org/10.1162/neco.1994.6.6.1223
Abstract
Inspired by the recent upsurge of interest in Bayesian methods, we consider adaptive regularization. A generalization-based scheme for adaptation of regularization parameters is introduced and compared to Bayesian regularization. We show that pruning arises naturally within both adaptive regularization schemes. As a model example we have chosen the simplest possible: estimating the mean of a random variable with known variance. Marked similarities are found between the two methods in that they both involve a “noise limit,” below which they regularize with infinite weight decay, i.e., they prune. However, pruning is not always beneficial. We show explicitly that both methods in some cases may increase the generalization error. This corresponds to situations where the underlying assumptions of the regularizer are poorly matched to the environment.
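For the mean-estimation example described in the abstract, the sketch below illustrates how a "noise limit" can lead to pruning. It assumes an evidence-style (ML-II) re-estimation of the weight-decay parameter for a Gaussian model with known variance; the function name `evidence_mean_estimate`, the variable names, and the specific shrinkage formula are illustrative assumptions, not formulas quoted from the paper.

```python
import numpy as np

def evidence_mean_estimate(x, sigma2):
    """Illustrative sketch (not the paper's exact algorithm): estimate the mean
    of data with known noise variance sigma2 under a zero-mean weight-decay
    (Gaussian) prior whose strength is adapted by evidence-style re-estimation.

    If the squared sample mean falls below the "noise limit" (the variance of
    the sample mean), the implied weight decay diverges and the estimate is
    pruned to zero; otherwise the sample mean is shrunk toward zero.
    """
    n = len(x)
    xbar = np.mean(x)
    noise_limit = sigma2 / n            # variance of the sample mean
    if xbar**2 <= noise_limit:
        return 0.0                      # infinite weight decay: prune
    # finite weight decay: shrink the sample mean toward zero
    return xbar * (1.0 - noise_limit / xbar**2)

rng = np.random.default_rng(0)
sigma2 = 1.0
weak = rng.normal(0.05, np.sqrt(sigma2), size=20)    # mean buried in noise -> pruned
strong = rng.normal(2.0, np.sqrt(sigma2), size=20)   # clear signal -> shrunk, kept
print(evidence_mean_estimate(weak, sigma2), evidence_mean_estimate(strong, sigma2))
```

As the abstract notes, pruning of this kind is not always beneficial: if the true mean is small but nonzero, setting the estimate to zero below the noise limit can increase, rather than decrease, the generalization error.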