Weight Space Structure and Internal Representations: A Direct Approach to Learning and Generalization in Multilayer Neural Networks
- 18 September 1995
- journal article
- research article
- Published by American Physical Society (APS) in Physical Review Letters
- Vol. 75 (12), 2432-2435
- https://doi.org/10.1103/physrevlett.75.2432
Abstract
We analytically derive the geometrical structure of the weight space in multilayer neural networks in terms of the volumes of couplings associated with the internal representations of the training set. In this framework, focusing on the parity and committee machines, we show how to deduce learning and generalization capabilities, both reinterpreting some known properties and finding new exact results. The relationship between our approach and information theory as well as the Mitchison-Durbin calculation is established. Our results are exact in the limit of a large number of hidden units, whereas for a finite number of hidden units a complete geometrical interpretation of symmetry breaking is given.
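As a rough illustration of the objects named in the abstract, the sketch below defines the outputs of a parity machine and a committee machine and estimates, by brute-force sampling, how the weights that fit a small training set split among internal representations. It is not the analytical calculation of the paper. The tree architecture (non-overlapping receptive fields), the Gaussian sampling of weights, the toy sizes, and all function names are assumptions made only for this example.

```python
import random
from collections import Counter


def sign(x):
    # Convention for this sketch: sign(0) = +1.
    return 1 if x >= 0 else -1


def internal_representation(weights, pattern):
    """Signs of the K hidden units for one input pattern.

    Assumed tree architecture: hidden unit k sees only block k of the
    pattern. `weights` and `pattern` are both lists of K lists of length n.
    """
    return tuple(
        sign(sum(w * x for w, x in zip(w_k, x_k)))
        for w_k, x_k in zip(weights, pattern)
    )


def committee_output(tau):
    # Majority vote of the hidden units (K odd avoids ties).
    return sign(sum(tau))


def parity_output(tau):
    # Product of the hidden-unit signs.
    out = 1
    for t in tau:
        out *= t
    return out


def volume_fractions(patterns, labels, output_fn, K, n, samples=2000, seed=0):
    """Monte Carlo estimate of how the weight vectors that fit the training
    set are shared among the internal representations of that set."""
    rng = random.Random(seed)
    counts = Counter()
    for _ in range(samples):
        weights = [[rng.gauss(0.0, 1.0) for _ in range(n)] for _ in range(K)]
        reps = tuple(internal_representation(weights, p) for p in patterns)
        if all(output_fn(tau) == y for tau, y in zip(reps, labels)):
            counts[reps] += 1
    total = sum(counts.values())
    return {reps: c / total for reps, c in counts.items()} if total else {}


# Toy training set: 2 random binary patterns, K = 3 hidden units, n = 5 inputs each.
rng = random.Random(1)
K, n = 3, 5
patterns = [[[rng.choice((-1, 1)) for _ in range(n)] for _ in range(K)]
            for _ in range(2)]
labels = [1, -1]

for name, fn in (("parity", parity_output), ("committee", committee_output)):
    fractions = volume_fractions(patterns, labels, fn, K, n)
    print(name, "machine:", len(fractions), "internal representations observed")
```

Each key of the returned dictionary is one internal representation of the whole training set (one tuple of hidden-unit signs per pattern), and its value is the estimated share of the solution volume carrying that representation.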
This publication has 13 references indexed in Scilit:
- Domains of Solutions and Replica Symmetry Breaking in Multilayer Neural Networks. Europhysics Letters, 1994
- Learning and generalization in a two-layer neural network: The role of the Vapnik-Chervonvenkis dimension. Physical Review Letters, 1994
- Generalization in a Large Committee Machine. Europhysics Letters, 1992
- Storage capacity and learning algorithms for two-layer neural networks. Physical Review A, 1992
- Broken symmetries in multilayered perceptrons. Physical Review A, 1992
- Statistical mechanics of a multilayered neural network. Physical Review Letters, 1990
- Bounds on the learning capacity of some multi-layer networks. Biological Cybernetics, 1989
- Optimal storage properties of neural network models. Journal of Physics A: General Physics, 1988
- The space of interactions in neural network models. Journal of Physics A: General Physics, 1988
- Geometrical and Statistical Properties of Systems of Linear Inequalities with Applications in Pattern Recognition. IEEE Transactions on Electronic Computers, 1965