Growing and pruning neural tree networks

Abstract
A new pattern classification method called the Neural Tree Network (NTN) is presented. The NTN consists of neural networks connected in a tree architecture. The neural networks are used to recursively partition the feature space into subregions, and each terminal subregion is assigned a class label that depends on the training data routed to it by the neural networks. In decision tree algorithms, the feature space is usually partitioned with hyperplanes constrained to be perpendicular to the feature axes; using a neural network at each tree node, as in the NTN, yields better classification performance. The NTN is grown by a learning algorithm, whereas the architecture of a Multi-Layer Perceptron (MLP) must be specified before learning can begin. This is a drawback of the MLP approach since, in many practical problems, numerous trials are needed before an appropriate architecture is found. The NTN also provides an attractive tradeoff between classification speed and hardware complexity compared to MLPs. Growing the smallest NTN that correctly classifies the training data is an NP-complete problem, as is training an MLP. In this paper, a heuristic learning algorithm based on minimizing the L1 norm of the error is used to grow the NTN. We show that this method minimizes the number of classification errors more effectively than the squared-error criterion used in backpropagation. An optimal pruning algorithm is given to improve the generalization of the NTN. Simulation results are presented for Boolean function learning tasks and a speaker-independent vowel recognition task. The NTN is shown to compare favorably with both neural networks and decision trees.
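To make the growing procedure concrete, the following is a minimal sketch, not the paper's implementation: each internal node holds a single sigmoid neuron trained by subgradient descent on the L1 norm of the error, each leaf holds the majority label of the training data routed to it, and a node that fails to split its data falls back to being a leaf. The class name NTNNode, the single-neuron splits, and all hyperparameters are illustrative assumptions; the paper's exact growing heuristic and its pruning algorithm are not reproduced here.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

class NTNNode:
    """One node of a neural tree; internal nodes hold a trained split neuron."""
    def __init__(self, max_depth=5, lr=0.5, epochs=2000):
        self.max_depth = max_depth
        self.lr = lr
        self.epochs = epochs
        self.w = None      # weights of the split neuron (None at a leaf)
        self.left = None
        self.right = None
        self.label = None  # majority class label, used at a leaf

    def fit(self, X, y):
        self.label = int(round(float(y.mean())))   # majority label
        if self.max_depth == 0 or y.min() == y.max():
            return self  # depth limit reached or node is pure: make a leaf
        Xb = np.hstack([X, np.ones((len(X), 1))])  # append a bias input
        self.w = rng.normal(scale=0.5, size=Xb.shape[1])
        for _ in range(self.epochs):
            out = sigmoid(Xb @ self.w)
            # subgradient descent on the L1 error sum_i |y_i - out_i|
            grad = Xb.T @ (-np.sign(y - out) * out * (1.0 - out))
            self.w -= self.lr * grad
        go_right = sigmoid(Xb @ self.w) >= 0.5
        if go_right.all() or (~go_right).all():
            self.w = None  # degenerate split: keep this node as a leaf
            return self
        self.left = NTNNode(self.max_depth - 1, self.lr, self.epochs).fit(
            X[~go_right], y[~go_right])
        self.right = NTNNode(self.max_depth - 1, self.lr, self.epochs).fit(
            X[go_right], y[go_right])
        return self

    def predict_one(self, x):
        if self.w is None:
            return self.label
        if sigmoid(np.append(x, 1.0) @ self.w) >= 0.5:
            return self.right.predict_one(x)
        return self.left.predict_one(x)

# XOR: a Boolean function no single hyperplane can classify,
# but a two-level tree of linear neurons can.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([0, 1, 1, 0], dtype=float)
tree = NTNNode().fit(X, y)
print([tree.predict_one(x) for x in X])  # typically [0, 1, 1, 0]
```

The sign term in the subgradient is what distinguishes the L1 criterion from the squared-error gradient used in backpropagation, which would weight each residual (y - out) linearly: under the L1 norm every misclassified point exerts the same pull regardless of how far its output is from the target, which is closer in spirit to minimizing the count of classification errors.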