Analysis of an approximate gradient projection method with applications to the backpropagation algorithm

1 January 1994

journal article
research article
Published by Taylor & Francis in Optimization Methods and Software

Vol. 4 (2) , 85-101
https://doi.org/10.1080/10556789408805580

Abstract

We analyze the convergence of an approximate gradient projection method for minimizing the sum of continuously differentiable functions over a nonempty closed convex set. In this method, the functions are aggregated and, at each iteration, a succession of gradient steps, one for each of the aggregate functions, is applied and the result is projected onto the convex set. We show that if the gradients of the functions are bounded and Lipschitz continuous over a certain level set and the stepsizes are chosen to be proportional to a certain residual squared or to be square summable, then every cluster point of the iterates is a stationary point. We apply these results to the backpropagation algorithm to obtain new deterministic convergence results for this algorithm. We also discuss the issues of parallel implementation and give a simple criterion for choosing the aggregation.
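
As a rough illustration of the iteration described above, the following sketch applies one gradient step per aggregate function in succession and then projects the result onto the convex set. Everything concrete here is an assumption made for the example and is not taken from the article: the least-squares terms, the box constraint, the grouping of terms into aggregates, the stepsize rule alpha_k = c/(k+1) (one square-summable choice), and all function names.

```python
# Illustrative sketch only: problem data, names, and stepsize rule are assumptions.

def project_box(x, lo, hi):
    """Project x onto the box [lo, hi]^n, a simple closed convex set."""
    return [min(max(v, lo), hi) for v in x]


def grad_group(x, group):
    """Gradient of one aggregate function: the sum of 0.5*(a.x - b)^2 over the group."""
    g = [0.0] * len(x)
    for a, b in group:
        residual = sum(ai * xi for ai, xi in zip(a, x)) - b
        for j, aj in enumerate(a):
            g[j] += residual * aj
    return g


def approx_gradient_projection(groups, x0, lo, hi, c=0.5, iters=200):
    """One gradient step per aggregate function in succession, then a projection."""
    x = list(x0)
    for k in range(iters):
        alpha = c / (k + 1)  # square-summable stepsizes (assumed rule)
        z = list(x)
        for group in groups:
            g = grad_group(z, group)  # gradient evaluated at the running point z
            z = [zj - alpha * gj for zj, gj in zip(z, g)]
        x = project_box(z, lo, hi)  # project only after all group steps
    return x


# Three least-squares terms aggregated into two groups (hypothetical data).
terms = [([1.0, 0.0], 2.0), ([0.0, 1.0], -1.0), ([1.0, 1.0], 1.0)]
groups = [terms[:2], terms[2:]]
x_final = approx_gradient_projection(groups, [0.5, 0.5], 0.0, 1.0)
print(x_final)
```

For this toy problem the unconstrained minimizer (2, -1) lies outside the box, and the point (1, 0) satisfies the stationarity conditions on the box, so the iterates should settle there. How the terms are grouped changes only how many gradient steps are taken between projections.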
