The consistency of the BIC Markov order estimator
Open Access
- 1 December 2000
- journal article
- Published by Institute of Mathematical Statistics in The Annals of Statistics
- Vol. 28 (6) , 1601-1619
- https://doi.org/10.1214/aos/1015957472
Abstract
The Bayesian Information Criterion (BIC) estimates the order of a Markov chain (with finite alphabet $A$) from observation of a sample path $x_1, x_2,\dots, x_n$, as that value $k = \hat{k}$ that minimizes the sum of the negative logarithm of the $k$th order maximum likelihood and the penalty term $\frac{|A|^k(|A|-1)}{2}\log n$. We show that $\hat{k}$ equals the correct order of the chain, eventually almost surely as $n \rightarrow \infty$, thereby strengthening earlier consistency results that assumed an apriori bound on the order. A key tool is a strong ratio-typicality result for Markov sample paths.We also show that the Bayesian estimator or minimum description length estimator, of which the BIC estimator is regarded as an approximation, fails to be consistent for the uniformly distributed i.i.d. process.
Keywords
This publication has 15 references indexed in Scilit:
- The minimum description length principle in coding and modelingIEEE Transactions on Information Theory, 1998
- Large deviations and the Bayesian estimation of higher-order Markov transition functionsJournal of Applied Probability, 1996
- The optimal error exponent for Markov order estimationIEEE Transactions on Information Theory, 1996
- The context-tree weighting method: basic propertiesIEEE Transactions on Information Theory, 1995
- A universal finite memory sourceIEEE Transactions on Information Theory, 1995
- Entropy and the Consistent Estimation of Joint DistributionsThe Annals of Probability, 1994
- Nonparametric Binary Regression: A Bayesian ApproachThe Annals of Statistics, 1993
- On the Choice of a Model to Fit Data from an Exponential FamilyThe Annals of Statistics, 1988
- The performance of universal encodingIEEE Transactions on Information Theory, 1981
- Estimating the Dimension of a ModelThe Annals of Statistics, 1978