Statistical mechanics of ensemble learning

Abstract
Within the context of learning a rule from examples, we study the general characteristics of learning with ensembles. The generalization performance achieved by a simple model ensemble of linear students is calculated exactly in the thermodynamic limit of a large number of input components and shows a surprisingly rich behavior. Our main findings are the following. For learning in large ensembles, it is advantageous to use underregularized students, which actually overfit the training data. Globally optimal generalization performance can be obtained by choosing the training set sizes of the students optimally. For smaller ensembles, optimization of the ensemble weights can yield significant improvements in ensemble generalization performance, in particular if the individual students are subject to noise in the training process. Choosing students with a wide range of regularization parameters makes this improvement robust against changes in the unknown level of corruption of the training data.
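The setting described above can be sketched numerically. The following is a minimal illustrative simulation, not the paper's exact model or calculation: each "student" is a ridge regression with a small (underregularized) ridge parameter, trained on its own random subset of a noisy linear teacher's data, and the ensemble prediction is the plain average of the students' outputs. All names and parameter values (`N`, `p`, `K`, `sigma`, `lam`) are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

N = 30          # input dimension
P_total = 200   # pool of training examples
p = 60          # training-set size per student
K = 10          # ensemble size
sigma = 0.5     # output noise level on the teacher's labels
lam = 1e-3      # small ridge parameter: deliberately underregularized

# random linear "teacher" rule
w_teacher = rng.standard_normal(N) / np.sqrt(N)

def make_data(n):
    X = rng.standard_normal((n, N))
    y = X @ w_teacher + sigma * rng.standard_normal(n)
    return X, y

X_pool, y_pool = make_data(P_total)
X_test, y_test = make_data(1000)

def ridge_fit(X, y, lam):
    # closed-form ridge solution: w = (X^T X + lam I)^{-1} X^T y
    return np.linalg.solve(X.T @ X + lam * np.eye(N), X.T @ y)

# each student sees its own random subset of the training pool
students = []
for _ in range(K):
    idx = rng.choice(P_total, size=p, replace=False)
    students.append(ridge_fit(X_pool[idx], y_pool[idx], lam))

preds = np.array([X_test @ w for w in students])      # shape (K, n_test)
indiv_errs = ((preds - y_test) ** 2).mean(axis=1)     # per-student test MSE
ens_err = ((preds.mean(axis=0) - y_test) ** 2).mean() # ensemble test MSE

print(f"mean individual test MSE: {indiv_errs.mean():.3f}")
print(f"ensemble test MSE:        {ens_err:.3f}")
```

For equally weighted averaging, the ensemble error equals the mean individual error minus the ensemble "ambiguity" (the variance of the students' predictions around their average), so the ensemble never does worse than the average student; the gain is largest precisely when underregularized students disagree strongly, which is the regime the abstract highlights.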
