First- and Second-Order Methods for Learning: Between Steepest Descent and Newton's Method
- 1 March 1992
- journal article
- Published by MIT Press in Neural Computation
- Vol. 4 (2), 141-166
- https://doi.org/10.1162/neco.1992.4.2.141
Abstract
On-line first-order backpropagation is sufficiently fast and effective for many large-scale classification problems, but for very high-precision mappings, batch processing may be the method of choice. This paper reviews first- and second-order optimization methods for learning in feedforward neural networks. The viewpoint is that of optimization: many methods can be cast in the language of optimization techniques, allowing the transfer to neural nets of detailed results about computational complexity and of safety procedures that ensure convergence and avoid numerical problems. The review is not intended to deliver detailed prescriptions for the most appropriate methods in specific applications, but to illustrate the main characteristics of the different methods and their mutual relations.
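The contrast drawn in the abstract can be illustrated on a toy quadratic loss (a minimal sketch, not code from the paper; the matrix `A`, vector `b`, learning rate, and iteration count are all illustrative assumptions): a first-order steepest-descent update takes many small steps scaled by a learning rate, while a second-order Newton update, which uses the Hessian, solves a quadratic problem in a single step.

```python
import numpy as np

# Toy quadratic loss f(w) = 0.5 * w^T A w - b^T w,
# with gradient g(w) = A w - b and constant Hessian H = A.
# A and b are hypothetical values chosen only for illustration.
A = np.array([[3.0, 1.0], [1.0, 2.0]])  # positive-definite Hessian
b = np.array([1.0, 1.0])

def grad(w):
    return A @ w - b

w_star = np.linalg.solve(A, b)  # true minimizer of the quadratic

# First-order steepest descent: fixed learning rate, many iterations.
w_sd = np.zeros(2)
lr = 0.1
for _ in range(100):
    w_sd = w_sd - lr * grad(w_sd)

# Second-order Newton step: w <- w - H^{-1} g; exact in one step
# for a quadratic loss.
w_newton = np.zeros(2)
w_newton = w_newton - np.linalg.solve(A, grad(w_newton))

print(np.allclose(w_newton, w_star))         # Newton lands on the minimizer
print(np.linalg.norm(w_sd - w_star) < 1e-3)  # descent only approaches it
```

On non-quadratic losses such as those of feedforward networks, the Newton step is applied to a local quadratic model and must be safeguarded (e.g. by line searches or trust regions), which is where the quasi-Newton and conjugate-gradient variants reviewed in the paper come in.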