Comparison of Approximate Methods for Handling Hyperparameters
- 1 July 1999
- journal article
- Published by MIT Press in Neural Computation
- Vol. 11 (5), 1035-1068
- https://doi.org/10.1162/089976699300016331
Abstract
I examine two approximate methods for computational implementation of Bayesian hierarchical models, that is, models that include unknown hyperparameters such as regularization constants and noise levels. In the evidence framework, the model parameters are integrated over, and the resulting evidence is maximized over the hyperparameters. The optimized hyperparameters are used to define a gaussian approximation to the posterior distribution. In the alternative MAP method, the true posterior probability is found by integrating over the hyperparameters. The true posterior is then maximized over the model parameters, and a gaussian approximation is made. The similarities of the two approaches and their relative merits are discussed, and comparisons are made with the ideal hierarchical Bayesian solution.

In moderately ill-posed problems, integration over hyperparameters yields a probability distribution with a skew peak, which causes significant biases to arise in the MAP method. In contrast, the evidence framework is shown to introduce negligible predictive error under straightforward conditions. General lessons are drawn concerning inference in many dimensions.
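As a rough illustration of the evidence framework described in the abstract, the sketch below assumes a linear-in-parameters model with a zero-mean gaussian prior of precision `alpha` (the regularization constant) and gaussian noise of precision `beta` (the noise level). The function name `evidence_framework`, the design matrix `Phi`, and the fixed-point re-estimation updates are assumptions chosen for illustration; the abstract does not specify an implementation, and this is not presented as the paper's own code.

```python
import numpy as np


def evidence_framework(Phi, y, n_iter=200, tol=1e-8):
    """Sketch: maximize the evidence over (alpha, beta) for a linear model.

    Assumed model (not taken from the abstract):
        y = Phi @ w + noise,  noise ~ N(0, 1/beta),  w ~ N(0, I/alpha).
    Returns the optimized hyperparameters and the gaussian approximation
    (mean and covariance) to the posterior over w that they define.
    """
    N, k = Phi.shape
    alpha, beta = 1.0, 1.0                      # arbitrary starting values
    gram = Phi.T @ Phi
    eig = np.linalg.eigvalsh(gram)              # eigenvalues of Phi^T Phi

    for _ in range(n_iter):
        # Posterior mode of w for the current hyperparameters.
        A = alpha * np.eye(k) + beta * gram     # posterior precision matrix
        w_mp = beta * np.linalg.solve(A, Phi.T @ y)

        # Effective number of well-determined parameters.
        lam = beta * eig
        gamma = np.sum(lam / (lam + alpha))

        # Fixed-point re-estimation of the hyperparameters.
        resid = y - Phi @ w_mp
        alpha_new = gamma / (w_mp @ w_mp)
        beta_new = (N - gamma) / (resid @ resid)

        converged = (abs(alpha_new - alpha) <= tol * alpha
                     and abs(beta_new - beta) <= tol * beta)
        alpha, beta = alpha_new, beta_new
        if converged:
            break

    # Gaussian approximation defined by the optimized hyperparameters.
    A = alpha * np.eye(k) + beta * gram
    w_mp = beta * np.linalg.solve(A, Phi.T @ y)
    return alpha, beta, w_mp, np.linalg.inv(A)
```

Calling `evidence_framework(Phi, y)` on synthetic data returns the optimized `alpha` and `beta` together with the mean and covariance of the gaussian approximation. The MAP method contrasted in the abstract would instead integrate the hyperparameters out first and then maximize over `w`; that variant is not sketched here.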