Cooling schedules for learning in neural networks

Abstract
We derive cooling schedules for the global optimization of learning in neural networks. We first discuss a two-level system with one global and one local minimum, then extend the analysis to systems with many minima. The optimal cooling schedule is (asymptotically) of the form η(t) = η*/ln t, where η(t) is the learning parameter at time t and η* is a constant that depends on the reference learning parameters for the various transitions. In some simple cases η* can be calculated explicitly. Simulations confirm the theoretical results.
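
As an illustrative aside, the schedule η(t) = η*/ln t is simple to apply in practice. Below is a minimal sketch in Python of noisy gradient descent on a toy cost with one local and one global minimum, cooled logarithmically; the constant eta_star, the offset t0 (which keeps ln t away from zero), the noise amplitude, and the quadratic-plus-quartic toy cost are all assumptions for illustration, not quantities taken from the paper.

import math
import random

def eta(t, eta_star, t0=2.0):
    """Logarithmic cooling schedule eta(t) = eta*/ln t.

    The offset t0 > 1 is an assumed regularization so that the
    logarithm never vanishes at the first time steps.
    """
    return eta_star / math.log(t + t0)

def grad_cost(x):
    # Toy cost E(x) = x^4/4 - x^2/2 - 0.2 x, so E'(x) = x^3 - x - 0.2.
    # It has a local minimum near x ≈ -0.89 and a global minimum
    # near x ≈ +1.09 (chosen here purely for illustration).
    return x**3 - x - 0.2

def anneal(x0=-1.0, eta_star=1.0, steps=100_000, seed=0):
    """Noisy gradient descent with step size cooled as eta*/ln t.

    The noise term scales with sqrt(eta), mimicking how the intrinsic
    fluctuations of on-line learning shrink with the learning parameter.
    """
    rng = random.Random(seed)
    x = x0
    for t in range(1, steps + 1):
        lr = eta(t, eta_star)
        x -= lr * grad_cost(x) + math.sqrt(lr) * rng.gauss(0.0, 0.3)
    return x

if __name__ == "__main__":
    # Started in the local minimum, the cooled dynamics should
    # typically settle near the global minimum at x ≈ +1.09.
    print(anneal())

Because η(t) decays only logarithmically, the effective noise shrinks slowly enough that the dynamics can still escape the local minimum at early times, which is the same mechanism that motivates ln-type schedules in simulated annealing.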