Abstract
We propose a novel learning algorithm for training multilayer networks of linear-threshold (hard-limiting) units. The learning scheme is based on standard backpropagation but uses "pseudo-gradient" descent, in which the gradient of a sigmoid function serves as a heuristic hint in place of the gradient of the hard-limiting function. We provide a justification that, for networks with one hidden layer, the pseudo-gradient always points in the correct downhill direction on the error surface. Such networks have several advantages: their internal representations in the hidden layers are clearly interpretable, and well-defined classification rules can be easily obtained; the calculations needed for classification after training are very simple; and they are easy to implement in hardware. Comparative experimental results on several benchmark problems, using both conventional backpropagation networks and our learning scheme for multilayer perceptrons, are presented and analyzed.
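To make the idea of pseudo-gradient descent concrete, the following is a minimal sketch rather than the exact procedure of this paper. It assumes a single hidden layer, a single hard-limiting output unit, a squared-error loss, a threshold at zero with outputs in {0, 1}, and a fixed learning rate; all function and parameter names (`hard_limit`, `sigmoid_grad`, `pseudo_gradient_step`, `lr`) are hypothetical. The forward pass uses the hard-limiting function, while the backward pass substitutes the derivative of a sigmoid evaluated at the same net input.

```python
import math


def hard_limit(x):
    """Linear-threshold activation (assumed convention: 1 if x >= 0, else 0)."""
    return 1.0 if x >= 0.0 else 0.0


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def sigmoid_grad(x):
    """Derivative of the sigmoid, used as the pseudo-gradient of hard_limit."""
    s = sigmoid(x)
    return s * (1.0 - s)


def forward(x, W1, b1, w2, b2):
    """Forward pass: hard-limiting hidden units and hard-limiting output unit."""
    net_h = [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(W1, b1)]
    h = [hard_limit(n) for n in net_h]
    net_o = sum(w * hj for w, hj in zip(w2, h)) + b2
    return net_h, h, net_o, hard_limit(net_o)


def pseudo_gradient_step(x, target, W1, b1, w2, b2, lr=0.5):
    """One update for one example, assuming E = 0.5 * (y - target)**2.

    Wherever backpropagation would need the derivative of the hard limiter,
    the sigmoid derivative at the same net input is used instead.
    """
    net_h, h, net_o, y = forward(x, W1, b1, w2, b2)
    delta_o = (y - target) * sigmoid_grad(net_o)
    delta_h = [delta_o * w2[j] * sigmoid_grad(net_h[j]) for j in range(len(h))]

    new_w2 = [w2[j] - lr * delta_o * h[j] for j in range(len(h))]
    new_b2 = b2 - lr * delta_o
    new_W1 = [[W1[j][i] - lr * delta_h[j] * x[i] for i in range(len(x))]
              for j in range(len(h))]
    new_b1 = [b1[j] - lr * delta_h[j] for j in range(len(h))]
    return new_W1, new_b1, new_w2, new_b2
```

In this sketch a correctly classified example produces a zero error term and therefore no weight change, while a misclassified example moves the weights against the sign of the error, scaled by the sigmoid derivative at each unit's net input.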