Topology and Geometry of Single Hidden Layer Network, Least Squares Weight Solutions

1 July 1995

journal article
Published by MIT Press in Neural Computation

Vol. 7 (4), 672-705
https://doi.org/10.1162/neco.1995.7.4.672

Abstract

In this paper the topological and geometric properties of the weight solutions for multilayer perceptron (MLP) networks under the MSE error criterion are characterized. The characterization is obtained by analyzing a homotopy from linear to nonlinear networks in which the hidden node function is slowly transformed from a linear to the final sigmoidal nonlinearity. Two different geometric perspectives for this optimization process are developed. The generic topology of the nonlinear MLP weight solutions is described and related to the geometric interpretations, error surfaces, and homotopy paths, both analytically and using carefully constructed examples. These results illustrate that although the natural homotopy provides a practically valuable heuristic for training, it suffers from a number of theoretical and practical difficulties. The linear system is a bifurcation point of the homotopy equations, and solution paths are therefore generically discontinuous. Bifurcations and infinite solutions further occur for data sets that are not of measure zero. These results weaken the guarantees on global convergence and exhaustive behavior normally associated with homotopy methods. However, the analyses presented provide a clear understanding of the relationship between linear and nonlinear perceptron networks, and thus a firm foundation for development of more powerful training methods. The geometric perspectives and generic topological results describing the nature of the solutions are further generally applicable to network analysis and algorithm evaluation.
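The homotopy described in the abstract can be illustrated with a small numerical sketch. The abstract does not give the paper's exact parameterization, so everything here is an assumption made for illustration: the convex blend `(1 - lam) * z + lam * tanh(z)` as the hidden node function, plain gradient descent as the path tracker, and the hypothetical names `hidden_activation`, `mse_and_grads`, and `train_by_homotopy`. The sketch starts from a least squares solution of the linear network (`lam = 0`) and warm-starts each later stage from the previous one. It makes no attempt to handle the bifurcations and discontinuous paths that the abstract warns about.

```python
import numpy as np

def hidden_activation(z, lam):
    """Assumed homotopy: identity at lam=0, tanh at lam=1."""
    return (1.0 - lam) * z + lam * np.tanh(z)

def hidden_activation_grad(z, lam):
    """Derivative of hidden_activation with respect to z."""
    return (1.0 - lam) + lam * (1.0 - np.tanh(z) ** 2)

def mse_and_grads(W1, W2, X, Y, lam):
    """Squared error (summed over outputs, averaged over samples) and gradients."""
    n = X.shape[0]
    Z = X @ W1                        # hidden pre-activations
    H = hidden_activation(Z, lam)     # hidden outputs
    R = H @ W2 - Y                    # residuals
    loss = np.sum(R ** 2) / n
    gW2 = 2.0 * H.T @ R / n
    gZ = (R @ W2.T) * hidden_activation_grad(Z, lam)
    gW1 = 2.0 * X.T @ gZ / n
    return loss, gW1, gW2

def train_by_homotopy(X, Y, n_hidden=3, n_stages=11,
                      inner_steps=200, lr=0.02, seed=0):
    """Track a weight solution from the linear net (lam=0) to the tanh net (lam=1)."""
    rng = np.random.default_rng(seed)
    W1 = 0.5 * rng.standard_normal((X.shape[1], n_hidden))
    # At lam=0 the network is linear, so with W1 fixed the output weights
    # have a closed-form least squares solution.
    W2, *_ = np.linalg.lstsq(X @ W1, Y, rcond=None)
    losses = []
    for lam in np.linspace(0.0, 1.0, n_stages):
        # Warm start: reuse the weights found at the previous value of lam.
        for _ in range(inner_steps):
            _, gW1, gW2 = mse_and_grads(W1, W2, X, Y, lam)
            W1 = W1 - lr * gW1
            W2 = W2 - lr * gW2
        losses.append(mse_and_grads(W1, W2, X, Y, lam)[0])
    return W1, W2, losses

if __name__ == "__main__":
    X = np.linspace(-1.0, 1.0, 40).reshape(-1, 1)
    Y = np.tanh(2.0 * X)              # toy target, chosen only for illustration
    W1, W2, losses = train_by_homotopy(X, Y)
    for lam, loss in zip(np.linspace(0.0, 1.0, len(losses)), losses):
        print(f"lam={lam:.1f}  mse={loss:.5f}")
```

Printing the per-stage loss gives a rough view of how the tracked solution evolves as the nonlinearity is switched on. A production path tracker would more likely use a predictor-corrector scheme with step-size control rather than a fixed grid of `lam` values.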
