Why natural gradient?

Abstract
Gradient adaptation is a useful technique for adjusting a set of parameters to minimize a cost function. While often easy to implement, the convergence speed of gradient adaptation can be slow when the slope of the cost function varies widely for small changes in the parameters. In this paper, we outline an alternative technique, termed natural gradient adaptation, that overcomes the poor convergence properties of gradient adaptation in many cases. The natural gradient is based on differential geometry and employs knowledge of the Riemannian structure of the parameter space to adjust the gradient search direction. Unlike Newton's method, natural gradient adaptation does not assume a locally-quadratic cost function. Moreover, for maximum likelihood estimation tasks, natural gradient adaptation is asymptotically Fisher-efficient. A simple example illustrates the desirable properties of natural gradient adaptation.
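The contrast the abstract draws can be illustrated with a minimal sketch (not taken from the paper): on an ill-conditioned quadratic cost, ordinary gradient descent must use a small step size dictated by the steepest direction, while a natural-gradient step rescales the search direction by the inverse of the metric. The diagonal metric `G` below is a hypothetical choice that happens to match the local curvature; in maximum likelihood settings the metric would be the Fisher information matrix.

```python
# Illustrative sketch: ordinary vs. natural gradient descent on the
# ill-conditioned quadratic cost C(w) = 0.5 * (100*w1^2 + w2^2).
# The metric G = diag(100, 1) is a hypothetical choice matching the
# curvature, so the natural gradient equalizes progress in both axes.

def grad(w):
    # Gradient of C: [100*w1, w2]
    return [100.0 * w[0], 1.0 * w[1]]

def step(w, lr, metric=None):
    g = grad(w)
    if metric is not None:
        # Natural gradient direction: G^{-1} * grad (G diagonal here)
        g = [g[i] / metric[i] for i in range(2)]
    return [w[i] - lr * g[i] for i in range(2)]

def cost(w):
    return 0.5 * (100.0 * w[0] ** 2 + w[1] ** 2)

w_plain = [1.0, 1.0]
w_nat = [1.0, 1.0]
G = [100.0, 1.0]  # assumed diagonal Riemannian metric

for _ in range(10):
    # Plain gradient: lr must stay below 0.02 or the steep axis diverges,
    # which makes the shallow axis converge very slowly.
    w_plain = step(w_plain, lr=0.009)
    # Natural gradient: the metric removes the conditioning problem,
    # so a single large learning rate works for both axes.
    w_nat = step(w_nat, lr=0.5, metric=G)

print("plain gradient cost:  ", cost(w_plain))
print("natural gradient cost:", cost(w_nat))
```

After ten steps the plain-gradient iterate has barely moved along the shallow axis, while the natural-gradient iterate has shrunk both coordinates by the same factor, which is the convergence behavior the abstract attributes to the method.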
