Hierarchical, Unsupervised Learning with Growing via Phase Transitions
- 15 February 1996
- journal article
- Published by MIT Press in Neural Computation
- Vol. 8 (2) , 425-450
- https://doi.org/10.1162/neco.1996.8.2.425
Abstract
We address unsupervised learning subject to structural constraints, with particular emphasis placed on clustering with an imposed decision tree structure. Most known methods are greedy, optimizing one node of the tree at a time to minimize a local cost. By constrast, we develop a joint optimization method, derived based on information-theoretic principles and closely related to known methods in statistical physics. The approach is inspired by the deterministic annealing algorithm for unstructured data clustering, which was based on maximum entropy inference. The new approach is founded on the principle of minimum cross-entropy, using informative priors to approximate the unstructured clustering solution while imposing the structural constraint. The resulting method incorporates supervised learning principles applied in an unsupervised problem setting. In our approach, the tree “grows” by a sequence of bifurcations that occur while optimizing an effective free energy cost at decreasing temperature scales. Thus, estimates of the tree size and structure are naturally obtained at each temperature in the process. Examples demonstrate considerable improvement over known methods.Keywords
This publication has 27 references indexed in Scilit:
- Speaker recognition using neural networks and conventional classifiersIEEE Transactions on Speech and Audio Processing, 1994
- Vector quantization with complexity costsIEEE Transactions on Information Theory, 1993
- Complexity Optimized Data Clustering by Competitive Neural NetworksNeural Computation, 1993
- An Analysis of the Elastic Net Approach to the Traveling Salesman ProblemNeural Computation, 1989
- Optimal pruning with applications to tree-structured source coding and modelingIEEE Transactions on Information Theory, 1989
- An analogue approach to the travelling salesman problem using an elastic net methodNature, 1987
- Speech coding based upon vector quantizationIEEE Transactions on Acoustics, Speech, and Signal Processing, 1980
- A Convergence Theorem for the Fuzzy ISODATA Clustering AlgorithmsPublished by Institute of Electrical and Electronics Engineers (IEEE) ,1980
- A Fuzzy Relative of the ISODATA Process and Its Use in Detecting Compact Well-Separated ClustersJournal of Cybernetics, 1973
- A clustering technique for summarizing multivariate dataBehavioral Science, 1967