Convergence properties of backpropagation for neural nets via theory of stochastic gradient methods. Part 1
- 1 January 1994
- Research article
- Published by Taylor & Francis in Optimization Methods and Software
- Vol. 4 (2), 117-134
- https://doi.org/10.1080/10556789408805582
Abstract
We study convergence properties of the serial and parallel backpropagation algorithms for training neural networks, as well as their modification with a momentum term. It is shown that these algorithms can be placed within the general framework of stochastic gradient methods. This makes it possible to treat, from the same standpoint, both stochastic and deterministic rules for selecting the components (training examples) of the error function to be minimized at each iteration. We obtain weaker stepsize conditions for the deterministic case and provide a quite general synchronization rule for the parallel version.
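To make the incremental-gradient viewpoint concrete, the following is a minimal sketch (not taken from the paper) of per-example backpropagation with a momentum term, where the component of the error function used at each iteration is chosen either cyclically (a deterministic rule) or uniformly at random (a stochastic rule), under a diminishing stepsize. The tiny network, dataset, stepsize schedule, and momentum coefficient are illustrative assumptions, not the authors' settings.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data: the total error is E(w) = sum_i E_i(w),
# with E_i the squared error on training example i.
N, d_in, d_hid = 20, 3, 5
X = rng.normal(size=(N, d_in))
y = np.sin(X @ rng.normal(size=d_in))        # arbitrary smooth target

# One-hidden-layer tanh network (weights collected in a dict).
w0 = {
    "W1": 0.1 * rng.normal(size=(d_in, d_hid)),
    "W2": 0.1 * rng.normal(size=(d_hid, 1)),
}

def grad_component(w, i):
    """Gradient of the single-example error E_i, computed by backpropagation."""
    x, t = X[i:i + 1], y[i:i + 1, None]       # shapes (1, d_in) and (1, 1)
    h = np.tanh(x @ w["W1"])                  # hidden activations
    err = h @ w["W2"] - t                     # output residual
    gW2 = h.T @ err
    gh = (err @ w["W2"].T) * (1.0 - h ** 2)   # backprop through tanh
    gW1 = x.T @ gh
    return {"W1": gW1, "W2": gW2}

def train(rule="cyclic", momentum=0.5, epochs=200):
    """Incremental gradient with momentum; `rule` picks the component order."""
    w = {k: v.copy() for k, v in w0.items()}
    vel = {k: np.zeros_like(v) for k, v in w0.items()}
    step = 0
    for _ in range(epochs):
        order = (np.arange(N) if rule == "cyclic"
                 else rng.integers(0, N, size=N))   # stochastic selection
        for i in order:
            step += 1
            lr = 0.2 / (1.0 + 0.005 * step)          # diminishing stepsize
            g = grad_component(w, i)
            for k in w:
                vel[k] = momentum * vel[k] - lr * g[k]   # momentum (heavy-ball) term
                w[k] += vel[k]
    # Report the total error over all components.
    h = np.tanh(X @ w["W1"])
    return float(np.sum((h @ w["W2"] - y[:, None]) ** 2))

print("cyclic (deterministic) rule, final error:", train("cyclic"))
print("random (stochastic) rule,   final error:", train("random"))
```

Both selection rules minimize the same sum-of-components error; the paper's point is that they can be analyzed in one stochastic-gradient framework, with the deterministic (cyclic) case admitting weaker stepsize conditions.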