Information geometry on hierarchy of probability distributions
- 1 July 2001
- journal article
- Published by Institute of Electrical and Electronics Engineers (IEEE) in IEEE Transactions on Information Theory
- Vol. 47 (5), 1701–1711
- https://doi.org/10.1109/18.930911
Abstract
An exponential family or mixture family of probability distributions has a natural hierarchical structure. This paper gives an "orthogonal" decomposition of such a system based on information geometry. A typical example is the decomposition of stochastic dependency among a number of random variables. In general, these variables have a complex structure of dependencies: pairwise dependency is easily represented by correlation, but it is more difficult to measure the effects of pure triplewise or higher order interactions (dependencies) among them. Stochastic dependency is decomposed quantitatively into an "orthogonal" sum of pairwise, triplewise, and further higher order dependencies. This gives a new invariant decomposition of joint entropy. The problem is important for extracting intrinsic interactions in the firing patterns of an ensemble of neurons and for estimating its functional connections. The orthogonal decomposition is given for a wide class of hierarchical structures, including both exponential and mixture families. As an example, we decompose the dependency in a higher order Markov chain into a sum of dependencies in various lower order Markov chains.
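As a concrete illustration of the decomposition described in the abstract, the sketch below (Python with NumPy; the function names and the random test distribution are illustrative, not from the paper) splits the total dependency (multi-information) of three binary variables into a pairwise and a pure triplewise component. It relies on the Pythagorean relation D(p ‖ p1) = D(p ‖ p2) + D(p2 ‖ p1), where p1 is the product of the single-variable marginals and p2 is the maximum-entropy distribution matching all pairwise marginals of p, computed here by iterative proportional fitting.

```python
# Minimal sketch of the hierarchical "orthogonal" decomposition for three
# binary variables. Assumptions: p2 is obtained by iterative proportional
# fitting (IPF), which converges to the m-projection of p onto the
# pairwise-interaction exponential family; all names are illustrative.
import numpy as np

def kl(p, q):
    """Kullback-Leibler divergence D(p || q) in nats."""
    mask = p > 0
    return float(np.sum(p[mask] * np.log(p[mask] / q[mask])))

def pairwise_maxent(p, iters=500):
    """Maximum-entropy 2x2x2 table matching all three pairwise marginals
    of p, computed by iterative proportional fitting."""
    q = np.full_like(p, 1.0 / p.size)
    for _ in range(iters):
        for keep in [(0, 1), (0, 2), (1, 2)]:
            drop = tuple(set(range(3)) - set(keep))
            target = p.sum(axis=drop, keepdims=True)
            current = q.sum(axis=drop, keepdims=True)
            q = q * np.where(current > 0, target / current, 0.0)
    return q

# A generic random joint distribution p(x1, x2, x3) over three binary
# variables; generic distributions carry interactions at every order.
rng = np.random.default_rng(0)
p = rng.dirichlet(np.ones(8)).reshape(2, 2, 2)

# p1: full independence, the product of the three one-variable marginals.
m = [p.sum(axis=tuple(set(range(3)) - {i})) for i in range(3)]
p1 = np.einsum('i,j,k->ijk', m[0], m[1], m[2])

# p2: max-entropy distribution with the pairwise marginals of p.
p2 = pairwise_maxent(p)

total = kl(p, p1)      # multi-information: total stochastic dependency
pairwise = kl(p2, p1)  # component explained by pairwise interactions
triple = kl(p, p2)     # pure triplewise component
print(f"total                 = {total:.6f}")
print(f"pairwise + triplewise = {pairwise + triple:.6f}")
```

Once IPF has converged, the printed total and the sum of the two components agree to numerical precision, which is the "orthogonality" of the decomposition: the pairwise and pure triplewise terms add up with no cross terms.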