Fast Curvature Matrix-Vector Products for Second-Order Gradient Descent
- 1 July 2002
- journal article
- Published by MIT Press in Neural Computation
- Vol. 14 (7), 1723-1738
- https://doi.org/10.1162/08997660260028683
Abstract
We propose a generic method for iteratively approximating various second-order gradient steps (Newton, Gauss-Newton, Levenberg-Marquardt, and natural gradient) in linear time per iteration, using special curvature matrix-vector products that can be computed in O(n). Two recent acceleration techniques for on-line learning, matrix momentum and stochastic meta-descent (SMD), implement this approach. Since both were originally derived by very different routes, this offers fresh insight into their operation, resulting in further improvements to SMD.
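As a minimal sketch of the O(n) curvature matrix-vector products the abstract refers to, the snippet below computes a fast exact Hessian-vector product via Pearlmutter's trick (see the 1994 reference below): differentiating the gradient along a direction v with a forward-over-reverse pass, so the n-by-n Hessian is never formed. The loss function here is a hypothetical placeholder; any twice-differentiable scalar loss works.

```python
import jax
import jax.numpy as jnp

def loss(w):
    # Placeholder scalar loss, chosen only for illustration.
    return jnp.sum(jnp.tanh(w) ** 2) + 0.5 * jnp.dot(w, w)

def hessian_vector_product(f, w, v):
    # Pearlmutter's trick: take the Jacobian-vector product of the
    # gradient function along direction v. jax.jvp returns both the
    # gradient at w and its directional derivative H v, in one pass
    # whose cost is a small constant multiple of evaluating f itself.
    _, hv = jax.jvp(jax.grad(f), (w,), (v,))
    return hv

w = jnp.ones(5)
v = jnp.arange(5.0)
print(hessian_vector_product(loss, w, v))
```

The paper's other curvature matrices (Gauss-Newton, Fisher) admit analogous fast products, which is what makes the iterative second-order steps it describes affordable per iteration.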
References
- A Fast, Compact Approximation of the Exponential Function. Neural Computation, 1999.
- Matrix momentum for practical natural gradient learning. Journal of Physics A: General Physics, 1999.
- Complexity Issues in Natural Gradient Descent Method for Training Multilayer Perceptrons. Neural Computation, 1998.
- Natural Gradient Works Efficiently in Learning. Neural Computation, 1998.
- Fast Exact Multiplication by the Hessian. Neural Computation, 1994.
- An Algorithm for Least-Squares Estimation of Nonlinear Parameters. Journal of the Society for Industrial and Applied Mathematics, 1963.
- A method for the solution of certain non-linear problems in least squares. Quarterly of Applied Mathematics, 1944.