First- and Second-Order Methods for Learning: Between Steepest Descent and Newton's Method
- 1 March 1992
- journal article
- Published by MIT Press in Neural Computation
- Vol. 4 (2), 141-166
- https://doi.org/10.1162/neco.1992.4.2.141
Abstract
On-line first-order backpropagation is sufficiently fast and effective for many large-scale classification problems, but for very high-precision mappings, batch processing may be the method of choice. This paper reviews first- and second-order optimization methods for learning in feedforward neural networks. The viewpoint is that of optimization: many methods can be cast in the language of optimization techniques, allowing the transfer to neural nets of detailed results about computational complexity and of safety procedures that ensure convergence and avoid numerical problems. The review is not intended to deliver detailed prescriptions for the most appropriate methods in specific applications, but to illustrate the main characteristics of the different methods and their mutual relations.
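The contrast drawn in the abstract can be illustrated on a toy quadratic loss (a minimal sketch, not code from the paper; the matrix `A`, vector `b`, learning rate, and iteration count are all illustrative assumptions): a first-order steepest-descent update takes many small steps scaled by a learning rate, while a second-order Newton update, which uses the Hessian, solves a quadratic problem in a single step.

```python
import numpy as np

# Toy quadratic loss f(w) = 0.5 * w^T A w - b^T w,
# with gradient g(w) = A w - b and constant Hessian H = A.
# A and b are hypothetical values chosen only for illustration.
A = np.array([[3.0, 1.0], [1.0, 2.0]])  # positive-definite Hessian
b = np.array([1.0, 1.0])

def grad(w):
    return A @ w - b

w_star = np.linalg.solve(A, b)  # true minimizer of the quadratic

# First-order steepest descent: fixed learning rate, many iterations.
w_sd = np.zeros(2)
lr = 0.1
for _ in range(100):
    w_sd = w_sd - lr * grad(w_sd)

# Second-order Newton step: w <- w - H^{-1} g; exact in one step
# for a quadratic loss.
w_newton = np.zeros(2)
w_newton = w_newton - np.linalg.solve(A, grad(w_newton))

print(np.allclose(w_newton, w_star))         # Newton lands on the minimizer
print(np.linalg.norm(w_sd - w_star) < 1e-3)  # descent only approaches it
```

On non-quadratic losses such as those of feedforward networks, the Newton step is applied to a local quadratic model and must be safeguarded (e.g. by line searches or trust regions), which is where the quasi-Newton and conjugate-gradient variants reviewed in the paper come in.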