Learning rate schedules for faster stochastic gradient search
- 2 January 2003
- conference paper
- Published by Institute of Electrical and Electronics Engineers (IEEE)
Abstract
The authors propose a new methodology for creating the first automatically adapting learning rates that achieve the optimal rate of convergence for stochastic gradient descent. Empirical tests agree with theoretical expectations that drift can be used to determine whether the crucial parameter c is large enough. Using this statistic, it will be possible to produce the first adaptive learning rates which converge at optimal speed.
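The abstract does not define the schedule constant c or the drift statistic, so the following Python sketch is only an assumption-laden illustration: the search-then-converge schedule eta_t = eta0 / (1 + t/tau) (so that the asymptotic constant is roughly c = eta0 * tau) follows the authors' earlier work on such schedules, and drift_statistic here is a simplified signal-to-noise ratio of parameter increments standing in for the paper's actual drift criterion; the function and variable names are hypothetical.

```python
import numpy as np

def search_then_converge(eta0, tau, t):
    """Search-then-converge schedule: eta_t = eta0 / (1 + t / tau).

    Roughly constant (eta0) for t << tau, the "search" phase, and decaying
    like (eta0 * tau) / t for t >> tau, the "converge" phase, so the
    asymptotic constant c ~ eta0 * tau governs the convergence rate.
    """
    return eta0 / (1.0 + t / tau)

def drift_statistic(param_history):
    """Illustrative drift statistic (an assumption, not the paper's formula).

    Ratio of the squared mean parameter increment to the variance of the
    increments: large values suggest the iterate is still moving steadily
    in one direction, i.e. the schedule constant is likely too small.
    """
    deltas = np.diff(np.asarray(param_history), axis=0)
    mean_step = deltas.mean(axis=0)
    var_step = deltas.var(axis=0) + 1e-12
    return float(np.mean(mean_step ** 2 / var_step))

# Toy usage: minimize a noisy 1-D quadratic with SGD under the schedule.
rng = np.random.default_rng(0)
w, history = 5.0, [5.0]
for t in range(1, 2001):
    grad = 2.0 * w + rng.normal(scale=1.0)   # noisy gradient of w**2
    w -= search_then_converge(eta0=0.5, tau=100.0, t=t) * grad
    history.append(w)

print(f"final w = {w:.4f}, drift = {drift_statistic(history[-500:]):.4f}")
```

In this toy run, a small drift value near the end indicates the iterate is hovering around the minimum rather than still descending systematically, which is the qualitative behaviour such a statistic is meant to detect.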
This publication has 8 references indexed in Scilit:
- Fast adaptive k-means clustering: some empirical results. Published by Institute of Electrical and Electronics Engineers (IEEE), 1990
- Increased rates of convergence through learning rate adaptation. Neural Networks, 1988
- The Strong Law of Large Numbers and Normality of Kesten's Procedure. Theory of Probability and Its Applications, 1979
- Rates of Convergence for Sequential Monte Carlo Optimization Methods. SIAM Journal on Control and Optimization, 1978
- A limit theorem for the Robbins-Monro approximation. Probability Theory and Related Fields, 1973
- On Asymptotic Normality in Stochastic Approximation. The Annals of Mathematical Statistics, 1968
- Accelerated Stochastic Approximation. The Annals of Mathematical Statistics, 1958
- On a Stochastic Approximation Method. The Annals of Mathematical Statistics, 1954