Optimal causal inference: Estimating stored information and approximating causal architecture
- 1 September 2010
- journal article
- Published by AIP Publishing in Chaos: An Interdisciplinary Journal of Nonlinear Science
- Vol. 20 (3) , 037111
- https://doi.org/10.1063/1.3489885
Abstract
We introduce an approach to inferring the causal architecture of stochastic dynamical systems that extends rate-distortion theory to use causal shielding, a natural principle of learning. We study two distinct cases of causal inference: optimal causal filtering and optimal causal estimation. Filtering corresponds to the ideal case in which the probability distribution of measurement sequences is known, giving a principled method to approximate a system’s causal structure at a desired level of representation. We show that in the limit in which a model-complexity constraint is relaxed, filtering finds the exact causal architecture of a stochastic dynamical system, known as the causal-state partition. From this, one can estimate the amount of historical information the process stores. More generally, causal filtering finds a graded model-complexity hierarchy of approximations to the causal architecture. Abrupt changes in the hierarchy, as a function of approximation, capture distinct scales of structural organization. For nonideal cases with finite data, we show how the correct number of the underlying causal states can be found by optimal causal estimation. A previously derived model-complexity control term allows us to correct for the effect of statistical fluctuations in probability estimates and thereby avoid overfitting.
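The filtering idea in the abstract can be illustrated with a small numerical sketch. The code below is not the authors' implementation: it assumes the standard self-consistent iteration used for rate-distortion and information-bottleneck problems, applied to a known distribution of futures given pasts. The function names (`causal_filter`, `partition_entropy_bits`), the trade-off parameter `beta`, and the toy distribution (four pasts, two of which predict the same future in each pair) are all hypothetical choices made for illustration.

```python
import numpy as np

def causal_filter(p_past, p_future_given_past, n_states, beta, n_iter=200, seed=0):
    """Soft assignment P(state | past) from an information-bottleneck-style iteration.

    Assumed update (hypothetical sketch, not taken from the paper's text):
        P(r|x) is proportional to P(r) * exp(-beta * KL[P(future|x) || P(future|r)])
    """
    rng = np.random.default_rng(seed)
    eps = 1e-12  # guards against log(0) and division by zero for empty states
    q = rng.random((p_past.shape[0], n_states))
    q /= q.sum(axis=1, keepdims=True)
    for _ in range(n_iter):
        p_r = p_past @ q                                       # P(r)
        p_f_r = (q * p_past[:, None]).T @ p_future_given_past  # sum_x P(f|x) P(x) P(r|x)
        p_f_r /= np.maximum(p_r[:, None], eps)                 # P(future | r)
        log_ratio = (np.log(p_future_given_past[:, None, :] + eps)
                     - np.log(p_f_r[None, :, :] + eps))
        kl = (p_future_given_past[:, None, :] * log_ratio).sum(axis=2)  # KL[x, r]
        log_q = np.log(p_r + eps)[None, :] - beta * kl
        log_q -= log_q.max(axis=1, keepdims=True)              # stabilise the softmax
        q = np.exp(log_q)
        q /= q.sum(axis=1, keepdims=True)
    return q

def partition_entropy_bits(p_past, labels):
    """Shannon entropy (bits) of the hard partition of pasts given by `labels`."""
    masses = np.bincount(labels, weights=p_past)
    masses = masses[masses > 0]
    return float(-(masses * np.log2(masses)).sum())

# Toy process (assumed): pasts 0 and 1 predict the same future, as do pasts 2 and 3.
p_past = np.array([0.4, 0.1, 0.3, 0.2])
p_future_given_past = np.array([[0.9, 0.1],
                                [0.9, 0.1],
                                [0.2, 0.8],
                                [0.2, 0.8]])

# Large beta corresponds to a relaxed model-complexity constraint.
q = causal_filter(p_past, p_future_given_past, n_states=3, beta=200.0)
labels = q.argmax(axis=1)  # hard partition read off from the soft assignment
print("partition labels:", labels)
print("stored information estimate (bits):", partition_entropy_bits(p_past, labels))
```

In this sketch, pasts with identical predictive distributions always receive identical assignments, so at large `beta` the hard partition groups them into the same state, and the entropy of that partition serves as a rough stand-in for the stored information. Surplus states may end up empty or may duplicate another state's predictive distribution, which is why the partition is read off with `argmax` rather than from the state probabilities directly. Lowering `beta` tightens the complexity constraint and would be expected to merge states. Choosing the number of states from finite data, as in optimal causal estimation, is not covered here.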
This publication has 27 references indexed in Scilit:
- Synchronization and control in intrinsic and designed computation: An information-theoretic analysis of competing models of stochastic computation. Chaos: An Interdisciplinary Journal of Nonlinear Science, 2010
- Time’s Barbed Arrow: Irreversibility, Crypticity, and Stored Information. Physical Review Letters, 2009
- Information-theoretic approach to interactive learning. Europhysics Letters, 2009
- Inferring Markov chains: Bayesian estimation, model comparison, entropy rate, and out-of-class modeling. Physical Review E, 2007
- How Many Clusters? An Information-Theoretic Perspective. Neural Computation, 2004
- Thermodynamic depth of causal states: Objective complexity via minimal representations. Physical Review E, 1999
- Inferring statistical complexity. Physical Review Letters, 1989
- Geometry from a Time Series. Physical Review Letters, 1980
- Finitary Codings and Weak Bernoulli Partitions. Proceedings of the American Mathematical Society, 1979
- Computation of channel capacity and rate-distortion functions. IEEE Transactions on Information Theory, 1972