Analysis of an approximate gradient projection method with applications to the backpropagation algorithm
- 1 January 1994
- journal article
- research article
- Published by Taylor & Francis in Optimization Methods and Software
- Vol. 4 (2) , 85-101
- https://doi.org/10.1080/10556789408805580
Abstract
We analyze the convergence of an approximate gradient projection method for minimizing the sum of continuously differentiable functions over a nonempty closed convex set. In this method, the functions are aggregated and, at each iteration, a succession of gradient steps, one for each of the aggregate functions, is applied and the result is projected onto the convex set. We show that if the gradients of the functions are bounded and Lipschitz continuous over a certain level set and the stepsizes are chosen to be proportional to a certain residual squared or to be square summable, then every cluster point of the iterates is a stationary point. We apply these results to the backpropagation algorithm to obtain new deterministic convergence results for this algorithm. We also discuss the issues of parallel implementation and give a simple criterion for choosing the aggregation.
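The iteration described in the abstract can be illustrated with a minimal sketch. It is not the paper's exact algorithm or notation. The function names, the box constraint, the least-squares terms, and the stepsize rule alpha_k = c / (k + 1) (one simple square-summable choice) are all assumptions made for illustration, and the residual-based stepsize mentioned in the abstract is not shown.

```python
from typing import Callable, List, Sequence

Vector = List[float]
Gradient = Callable[[Sequence[float]], Vector]


def project_box(x: Sequence[float], lo: float, hi: float) -> Vector:
    """Euclidean projection onto the box [lo, hi]^n, a closed convex set."""
    return [min(max(v, lo), hi) for v in x]


def make_least_squares_grad(a: Sequence[float], b: float) -> Gradient:
    """Gradient of the hypothetical term f(x) = 0.5 * (a . x - b)^2."""
    def grad(x: Sequence[float]) -> Vector:
        residual = sum(ai * xi for ai, xi in zip(a, x)) - b
        return [residual * ai for ai in a]
    return grad


def approx_gradient_projection(
    grads: Sequence[Gradient],
    x0: Sequence[float],
    project: Callable[[Sequence[float]], Vector],
    num_iters: int = 2000,
    c: float = 0.5,
) -> Vector:
    """One gradient step per aggregate function, then a single projection."""
    x = list(x0)
    for k in range(num_iters):
        # Assumed stepsize: sum of alpha_k^2 is finite, sum of alpha_k is not.
        alpha = c / (k + 1)
        z = list(x)
        for grad in grads:
            # Each step uses the gradient at the current intermediate point z.
            g = grad(z)
            z = [zi - alpha * gi for zi, gi in zip(z, g)]
        # Project only once, after all aggregate functions have been visited.
        x = project(z)
    return x


# Three aggregate functions on R^2, constrained to the box [0, 1]^2.
grads = [
    make_least_squares_grad([1.0, 0.0], 2.0),
    make_least_squares_grad([0.0, 1.0], -1.0),
    make_least_squares_grad([1.0, 1.0], 3.0),
]
x_star = approx_gradient_projection(
    grads, [0.0, 0.0], lambda v: project_box(v, 0.0, 1.0)
)
print(x_star)  # close to [1.0, 0.5], the constrained minimizer of the sum
```

In this toy problem the constrained minimizer of the sum is (1, 0.5), so the iterates can be checked by hand. In the backpropagation setting, each aggregate function would correspond to the error on a group of training examples, and the choice of grouping is the aggregation the abstract refers to.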