Memory-universal prediction of stationary random processes
- 1 January 1998
- journal article
- Published by Institute of Electrical and Electronics Engineers (IEEE) in IEEE Transactions on Information Theory
- Vol. 44 (1), 117-133
- https://doi.org/10.1109/18.650998
Abstract
We consider the problem of one-step-ahead prediction of a real-valued, stationary, strongly mixing random process (X_i)_{i=-∞}^{∞}. The best mean-square predictor of X_0 is its conditional mean given the entire infinite past (X_i)_{i=-∞}^{-1}. Given a sequence of observations X_1, X_2, …, X_N, we propose estimators for the conditional mean based on sequences of parametric models of increasing memory and of increasing dimension, for example, neural networks and Legendre polynomials. The proposed estimators select both the model memory and the model dimension, in a data-driven fashion, by minimizing certain complexity-regularized least squares criteria. When the underlying predictor function has a finite memory, we establish that the proposed estimators are memory-universal: although they do not know the true memory, they deliver the same statistical performance (rates of integrated mean-squared error) as estimators that do. Furthermore, when the underlying predictor function does not have a finite memory, we establish that the estimator based on Legendre polynomials is consistent.
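The data-driven selection described above can be sketched in code. The snippet below is an illustrative simplification, not the paper's exact criterion: it fits least-squares Legendre-polynomial predictors over a grid of memories d and dimensions k, and picks the pair minimizing empirical squared error plus a complexity penalty. The penalty form c·dk·log N/N, the constant c, the additive (per-lag) Legendre basis, and all grid bounds are assumptions made for illustration.

```python
import numpy as np

def select_model(X, max_memory=4, max_dim=4, c=1.0):
    """Pick memory d and dimension k by minimizing a complexity-regularized
    least squares criterion (illustrative penalty: c * d*k * log(N) / N).
    Uses an additive Legendre feature map per lag, a simplification of a
    full multivariate basis. Returns (d, k, coefficients)."""
    X = np.asarray(X, dtype=float)
    N = len(X)
    best = None
    for d in range(1, max_memory + 1):
        for k in range(1, max_dim + 1):
            # Design matrix: Legendre polynomials P_0..P_{k-1} of each of
            # the d lagged values, stacked into one feature vector per row.
            rows = [np.polynomial.legendre.legvander(X[t - d:t], k - 1).ravel()
                    for t in range(d, N)]
            Phi = np.array(rows)
            y = X[d:N]
            coef, *_ = np.linalg.lstsq(Phi, y, rcond=None)
            mse = np.mean((Phi @ coef - y) ** 2)
            crit = mse + c * d * k * np.log(N) / N  # penalized criterion
            if best is None or crit < best[0]:
                best = (crit, d, k, coef)
    return best[1:]
```

Because the penalty grows with both d and k, the criterion trades fit against model complexity, which is what allows the selected memory to track the (unknown) true memory as N grows.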