How Far are Automatically Chosen Regression Smoothing Parameters from their Optimum?

1 March 1988

journal article
research article
Published by Taylor & Francis in Journal of the American Statistical Association

Vol. 83 (401), 86-95
https://doi.org/10.1080/01621459.1988.10478568

Abstract

We address the problem of smoothing parameter selection for nonparametric curve estimators in the specific context of kernel regression estimation. Call the “optimal bandwidth” the minimizer of the average squared error. We consider several automatically selected bandwidths that approximate the optimum. How far are the automatically selected bandwidths from the optimum? The answer is studied theoretically and through simulations. The theoretical results include a central limit theorem that quantifies the convergence rate and gives the difference's asymptotic distribution. The convergence rate turns out to be excruciatingly slow. This is not too disappointing, because this rate is of the same order as the convergence rate of the difference between the minimizers of the average squared error and the mean average squared error. In some simulations by John Rice, the selectors considered here performed quite differently from each other. We anticipated that these differences would be reflected in different asymptotic distributions for the various selectors. It is surprising that all of the selectors have the same limiting normal distribution. To provide insight into the gap between our theoretical results and these simulations, we did a further Monte Carlo study. Our simulations support the theoretical results, and suggest that the differences observed by Rice were principally due to the choice of a very small error standard deviation and the choice of error criterion. In the example considered here, the asymptotic normality result describes the empirical distribution of the automatically chosen bandwidths quite well, even for small samples.
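The comparison described in the abstract can be illustrated with a minimal sketch. It is not the authors' code or experimental setup: the Nadaraya-Watson estimator, the Gaussian kernel, the sine regression function, the noise level, the bandwidth grid, and all function names are assumptions chosen for illustration, and leave-one-out cross-validation stands in as one example of an automatic selector. The sketch computes the bandwidth that minimizes the average squared error (which requires the true curve, so it is only available in simulation) and the bandwidth chosen by cross-validation from the data alone.

```python
import numpy as np


def nw_fit(x, y, h, leave_one_out=False):
    """Nadaraya-Watson estimate at the design points (Gaussian kernel assumed)."""
    u = (x[:, None] - x[None, :]) / h
    w = np.exp(-0.5 * u**2)           # w[i, j] = K((x_i - x_j) / h)
    if leave_one_out:
        np.fill_diagonal(w, 0.0)      # drop the i-th observation when predicting y_i
    return (w @ y) / w.sum(axis=1)


def average_squared_error(x, y, m_true, h):
    """Average squared error: needs the true regression function, so simulation only."""
    return np.mean((nw_fit(x, y, h) - m_true) ** 2)


def cv_score(x, y, h):
    """Leave-one-out cross-validation score: computable from the data alone."""
    return np.mean((y - nw_fit(x, y, h, leave_one_out=True)) ** 2)


def select_bandwidths(x, y, m_true, grid):
    """Return (ASE-optimal bandwidth, CV-selected bandwidth) over a grid."""
    ase = np.array([average_squared_error(x, y, m_true, h) for h in grid])
    cv = np.array([cv_score(x, y, h) for h in grid])
    return grid[np.argmin(ase)], grid[np.argmin(cv)]


# Hypothetical simulation setup: equally spaced design, sine curve, Gaussian errors.
rng = np.random.default_rng(0)
n = 100
x = (np.arange(n) + 0.5) / n
m_true = np.sin(2 * np.pi * x)
y = m_true + 0.3 * rng.standard_normal(n)
grid = np.linspace(0.01, 0.3, 60)

h_ase, h_cv = select_bandwidths(x, y, m_true, grid)
print(f"ASE-optimal bandwidth: {h_ase:.4f}")
print(f"CV-selected bandwidth: {h_cv:.4f}")
print(f"difference:            {h_cv - h_ase:+.4f}")
```

Repeating the simulation over many random seeds and collecting the differences between the two bandwidths gives an empirical distribution that could be compared with a normal approximation of the kind the abstract describes.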
