How Far are Automatically Chosen Regression Smoothing Parameters from their Optimum?
- 1 March 1988
- journal article
- research article
- Published by Taylor & Francis in Journal of the American Statistical Association
- Vol. 83 (401), 86-95
- https://doi.org/10.1080/01621459.1988.10478568
Abstract
We address the problem of smoothing parameter selection for nonparametric curve estimators in the specific context of kernel regression estimation. Call the “optimal bandwidth” the minimizer of the average squared error. We consider several automatically selected bandwidths that approximate the optimum. How far are the automatically selected bandwidths from the optimum? The answer is studied theoretically and through simulations. The theoretical results include a central limit theorem that quantifies the convergence rate and gives the difference's asymptotic distribution. The convergence rate turns out to be excruciatingly slow. This is not too disappointing, because this rate is of the same order as the convergence rate of the difference between the minimizers of the average squared error and the mean average squared error. In some simulations by John Rice, the selectors considered here performed quite differently from each other. We anticipated that these differences would be reflected in different asymptotic distributions for the various selectors. It is surprising that all of the selectors have the same limiting normal distribution. To provide insight into the gap between our theoretical results and these simulations, we did a further Monte Carlo study. Our simulations support the theoretical results, and suggest that the differences observed by Rice were principally due to the choice of a very small error standard deviation and the choice of error criterion. In the example considered here, the asymptotic normality result describes the empirical distribution of the automatically chosen bandwidths quite well, even for small samples.
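The comparison described in the abstract can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the article: the Nadaraya-Watson form of the kernel estimator, the Epanechnikov kernel, the sine test curve, the equally spaced design, the error standard deviation, the bandwidth grid, and the use of leave-one-out cross-validation as the only automatic selector. The average squared error is also computed without the boundary weight function that such studies often use. The sketch finds the bandwidth minimizing the average squared error (the "optimal bandwidth") and the bandwidth minimizing the cross-validation score, so the two can be compared on one simulated sample.

```python
import math
import random


def epanechnikov(u):
    """Epanechnikov kernel (an assumed choice, not specified by the abstract)."""
    return 0.75 * (1.0 - u * u) if abs(u) <= 1.0 else 0.0


def nw_estimate(x0, xs, ys, h, skip=None):
    """Nadaraya-Watson estimate at x0; observation index `skip` is left out.

    Returns None when no observation receives positive weight.
    """
    num = 0.0
    den = 0.0
    for j, (xj, yj) in enumerate(zip(xs, ys)):
        if j == skip:
            continue
        w = epanechnikov((x0 - xj) / h)
        num += w * yj
        den += w
    return num / den if den > 0.0 else None


def average_squared_error(h, xs, ys, m):
    """Average squared distance between the estimate and the true curve m."""
    total = 0.0
    for x in xs:
        est = nw_estimate(x, xs, ys, h)
        if est is None:
            return math.inf
        total += (est - m(x)) ** 2
    return total / len(xs)


def cv_score(h, xs, ys):
    """Leave-one-out cross-validation score; needs only the observed data."""
    total = 0.0
    for i, x in enumerate(xs):
        est = nw_estimate(x, xs, ys, h, skip=i)
        if est is None:
            return math.inf
        total += (ys[i] - est) ** 2
    return total / len(xs)


def simulate(n=50, sigma=0.1, seed=0):
    """Return (ASE-optimal bandwidth, CV-selected bandwidth) for one sample."""
    rng = random.Random(seed)

    def m(x):
        # Hypothetical true regression curve.
        return math.sin(2.0 * math.pi * x)

    xs = [(i + 0.5) / n for i in range(n)]          # equally spaced design
    ys = [m(x) + rng.gauss(0.0, sigma) for x in xs]  # noisy observations
    grid = [0.03 + 0.01 * k for k in range(30)]      # candidate bandwidths
    h_opt = min(grid, key=lambda h: average_squared_error(h, xs, ys, m))
    h_cv = min(grid, key=lambda h: cv_score(h, xs, ys))
    return h_opt, h_cv


if __name__ == "__main__":
    h_opt, h_cv = simulate()
    print("ASE-optimal bandwidth:", h_opt)
    print("CV-selected bandwidth:", h_cv)
```

Repeating `simulate` over many seeds and looking at the spread of `h_cv - h_opt` gives a rough empirical picture of the kind of difference whose distribution the abstract discusses. The optimal bandwidth uses the true curve and so is only available in a simulation, whereas the cross-validation score uses the data alone.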