Information theory and local learning rules in a self-organizing network of Ising spins

1 September 1995

journal article
research article
Published by American Physical Society (APS) in Physical Review E

Vol. 52 (3), 2860-2871
https://doi.org/10.1103/physreve.52.2860

Abstract

The Boltzmann machine uses the relative entropy as a cost function to fit the Boltzmann distribution to a fixed given distribution. Instead of the relative entropy, we use the mutual information between input and output units to define an unsupervised analogy to the conventional Boltzmann machine. Our network of Ising spins is fed by an external field via the input units. The output units should self-organize to form an “internal” representation of the “environmental” input, thereby compressing the data and extracting relevant features. The mutual information and its gradient with respect to the weights principally require nonlocal information, e.g., in the form of multipoint correlation functions. Hence the exact gradient can hardly be boiled down to a local learning rule. Conversely, by using only local terms and two-point interactions, the entropy of the output layer cannot be ensured to reach the maximum possible entropy for a fixed number of output neurons. Some redundancy may remain in the representation of the data at the output. We account for this limitation from the very beginning by reformulating the cost function correspondingly. From this cost function, local Hebb-like learning rules can be derived. Some experiments with these local learning rules are presented.
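
As a rough illustration of the quantity the abstract takes as its cost function, the sketch below computes the mutual information between input and output spins by exact enumeration for a very small network. It is not the paper's method or its local learning rule. The function names, the `weights` and `beta` parameters, the clamping of input patterns, and the absence of lateral couplings between output spins are all assumptions made here for illustration.

```python
import itertools
import math


def output_given_input(weights, x, beta=1.0):
    """P(y | x) for output spins y in {-1, +1}^m.

    Assumed energy: E(y | x) = -sum_ij y_i * w_ij * x_j, with the input
    pattern x clamped and no couplings among the output spins.
    """
    m = len(weights)
    states = list(itertools.product((-1, 1), repeat=m))
    # Local field on each output spin produced by the clamped input.
    fields = [sum(w * xj for w, xj in zip(row, x)) for row in weights]
    boltz = [math.exp(beta * sum(yi * h for yi, h in zip(y, fields)))
             for y in states]
    z = sum(boltz)  # partition function for this input pattern
    return {y: b / z for y, b in zip(states, boltz)}


def mutual_information(p_x, weights, beta=1.0):
    """I(X; Y) in bits, summing over every input and output state."""
    joint = {}
    p_y = {}
    for x, px in p_x.items():
        for y, pyx in output_given_input(weights, x, beta).items():
            joint[(x, y)] = px * pyx
            p_y[y] = p_y.get(y, 0.0) + px * pyx
    return sum(p * math.log2(p / (p_x[x] * p_y[y]))
               for (x, y), p in joint.items() if p > 0)


# One input spin, one output spin, both input states equally likely.
uniform_input = {(-1,): 0.5, (1,): 0.5}
mi_uncoupled = mutual_information(uniform_input, [[0.0]])
mi_strong = mutual_information(uniform_input, [[20.0]])
```

With zero weight the output carries no information about the input, and with a very strong coupling the output copies the input, so the mutual information approaches its one-bit maximum. Exact enumeration scales exponentially with the number of spins, and it touches every joint state at once, which gives a feel for why the abstract describes the exact gradient as requiring nonlocal information.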
