On-line learning with minimal degradation in feedforward networks

Abstract
Dealing with non-stationary processes requires quick adaptation while avoiding catastrophic forgetting. We present a neural learning technique that satisfies both requirements without sacrificing the benefits of distributed representations. It relies on formalizing the problem as the minimization of the error over the previously learned input-output (i-o) patterns, subject to the constraint of perfect encoding of the new pattern. This constrained optimization problem is then transformed into an unconstrained one with hidden-unit activations as variables. The new formulation leads naturally to an algorithm for solving the problem, which we call Learning with Minimal Degradation (LMD). Experimental comparisons of LMD with back-propagation show the advantages of LMD and reveal the dependence of forgetting on the learning rate in back-propagation. We also explain why overtraining affects forgetting and fault-tolerance, which are seen as related problems.
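
As a sketch of the formalization described in the abstract, the constrained problem can be written as follows. The notation is an assumption introduced here for illustration and is not taken from the paper: $W$ denotes the network weights, $f(\cdot\,; W)$ the mapping computed by the network, $(x_p, y_p)$ for $p = 1, \dots, P$ the previously learned i-o patterns, and $(x_{\mathrm{new}}, y_{\mathrm{new}})$ the new pattern. The squared-error measure is likewise assumed.

```latex
\min_{W} \; E_{\mathrm{old}}(W) = \sum_{p=1}^{P} \bigl\| f(x_p; W) - y_p \bigr\|^2
\quad \text{subject to} \quad f(x_{\mathrm{new}}; W) = y_{\mathrm{new}}
```

The objective expresses minimal degradation on the old patterns, and the equality constraint expresses perfect encoding of the new one. The transformation to an unconstrained problem in the hidden-unit activations is not sketched here, since the abstract does not state its form.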