Hypothesis Selection and Testing by the MDL Principle
- 1 April 1999
- journal article
- Published by Oxford University Press (OUP) in The Computer Journal
- Vol. 42 (4), 260-269
- https://doi.org/10.1093/comjnl/42.4.260
Abstract
The central idea of the MDL (Minimum Description Length) principle is to represent a class of models (hypotheses) by a universal model capable of imitating the behavior of any model in the class. The principle calls for a model class whose representative assigns the largest probability or density to the observed data. Two examples of universal models for parametric classes ${\cal M}$ are the normalized maximum likelihood (NML) model \[ \hat f (x^n\mid{\cal M})=f(x^n\mid\hat \theta(x^n))\bigg/\int_\Omega f(y^n\mid\hat\theta(y^n))\,{\rm d}y^n, \] where $\Omega$ is an appropriately selected set, and a mixture \[ f_w(x^n\mid {\cal M})=\int f(x^n\mid\theta)w(\theta)\,{\rm d}\theta \] as a convex linear functional of the models. In this interpretation a Bayes factor $B=f_w(x^n\mid{\cal M}_1)/f_v(x^n\mid{\cal M}_2)$ is the ratio of mixture representatives of two model classes. However, mixtures need not be the best representatives, and as will be shown the NML model provides a strictly better test for the mean being zero in the Gaussian cases where the variance is known or taken as a parameter.
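As a concrete illustration of the two universal models named in the abstract, here is a minimal Python sketch on the Bernoulli class, where both admit closed forms. This is not taken from the paper: the choice of the Bernoulli class, the uniform prior $w(\theta)\equiv 1$ in the mixture, and all function names are assumptions made for tractability.

```python
# Sketch (assumed example, not from the paper): the NML model and a
# uniform-prior mixture for the Bernoulli class over binary strings of
# length n. A string with k ones has ML estimate theta_hat = k/n.
from math import comb, factorial

def nml_normalizer(n: int) -> float:
    """Sum of maximized likelihoods over all 2^n binary sequences,
    grouped by the count k of ones (C(n, k) sequences per group)."""
    total = 0.0
    for k in range(n + 1):
        theta_hat = k / n
        # Maximized likelihood theta_hat^k (1-theta_hat)^(n-k);
        # Python evaluates 0.0 ** 0 as 1.0, matching the convention 0^0 = 1.
        total += comb(n, k) * (theta_hat ** k) * ((1 - theta_hat) ** (n - k))
    return total

def nml_probability(k: int, n: int) -> float:
    """NML probability of one particular sequence with k ones:
    maximized likelihood divided by the normalizing sum."""
    theta_hat = k / n
    ml = (theta_hat ** k) * ((1 - theta_hat) ** (n - k))
    return ml / nml_normalizer(n)

def mixture_probability(k: int, n: int) -> float:
    """Uniform-prior mixture: integral over [0,1] of
    theta^k (1-theta)^(n-k) d(theta) = k! (n-k)! / (n+1)!."""
    return factorial(k) * factorial(n - k) / factorial(n + 1)

if __name__ == "__main__":
    n, k = 10, 7
    print(f"NML:     {nml_probability(k, n):.6f}")
    print(f"Mixture: {mixture_probability(k, n):.6f}")
```

In this setting a Bayes factor between two model classes, as in the abstract, would be the ratio of their respective mixture probabilities for the same data; the paper's point is that the NML representative can yield a strictly better test than such mixtures.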