Frequency of Selecting Noise Variables in Subset Regression Analysis: A Simulation Study
- 1 February 1987
- journal article
- research article
- Published by Taylor & Francis in The American Statistician
- Vol. 41 (1) , 84-86
- https://doi.org/10.1080/00031305.1987.10475450
Abstract
This article presents the results of a simulation study of variable selection in a multiple regression context that evaluates the frequency of selecting noise variables and the bias of the adjusted R² of the selected variables when some of the candidate variables are authentic. It is demonstrated that for most samples a large percentage of the selected variables is noise, particularly when the number of candidate variables is large relative to the number of observations. The adjusted R² of the selected variables is highly inflated.
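The kind of experiment the abstract describes can be illustrated with a small simulation. The sketch below is not the article's procedure: the abstract does not state the selection rule, sample sizes, or effect sizes, so the forward-selection rule with an F-to-enter threshold of 4.0, the independent standard normal predictors, and the defaults (30 observations, 20 candidates, 3 authentic variables with coefficient 0.5) are all assumptions chosen for illustration, and the function names are hypothetical.

```python
import numpy as np


def rss_of(y, X, cols):
    """Residual sum of squares of an OLS fit of y on an intercept plus X[:, cols]."""
    n = len(y)
    A = np.column_stack([np.ones(n)] + [X[:, c] for c in cols])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    return float(np.sum((y - A @ beta) ** 2))


def adjusted_r2(y, X, cols):
    """Adjusted R^2 of the model containing an intercept and the columns in cols."""
    n, p = len(y), len(cols)
    tss = float(np.sum((y - y.mean()) ** 2))
    return 1.0 - (rss_of(y, X, cols) / (n - p - 1)) / (tss / (n - 1))


def forward_select(y, X, f_enter=4.0):
    """Forward selection: repeatedly add the candidate with the largest partial F,
    stopping when that F falls below f_enter (an assumed threshold)."""
    n, k = X.shape
    selected = []
    rss_cur = rss_of(y, X, selected)
    while len(selected) < min(k, n - 2):
        best = None
        df = n - len(selected) - 2  # residual df after adding one more variable
        for c in range(k):
            if c in selected:
                continue
            rss_new = rss_of(y, X, selected + [c])
            f_stat = (rss_cur - rss_new) / (rss_new / df)
            if best is None or f_stat > best[0]:
                best = (f_stat, c, rss_new)
        if best is None or best[0] < f_enter:
            break
        selected.append(best[1])
        rss_cur = best[2]
    return selected


def simulate(n=30, k=20, n_true=3, beta=0.5, reps=50, seed=0):
    """Return (mean share of selected variables that are noise, mean adjusted R^2).

    Columns 0..n_true-1 are authentic; the remaining columns are pure noise.
    """
    rng = np.random.default_rng(seed)
    noise_share, adj = [], []
    for _ in range(reps):
        X = rng.standard_normal((n, k))
        y = beta * X[:, :n_true].sum(axis=1) + rng.standard_normal(n)
        sel = forward_select(y, X)
        if not sel:  # nothing entered the model in this sample
            continue
        noise_share.append(np.mean([c >= n_true for c in sel]))
        adj.append(adjusted_r2(y, X, sel))
    return float(np.mean(noise_share)), float(np.mean(adj))


if __name__ == "__main__":
    share, adj = simulate()
    print(f"mean share of noise among selected variables: {share:.2f}")
    print(f"mean adjusted R^2 of selected model: {adj:.2f}")
```

Under these assumed settings the population R² is 0.75 / 1.75, about 0.43, which gives a baseline against which the average adjusted R² of the selected models can be compared. Raising `k` relative to `n` is the natural way to probe the abstract's claim about large candidate sets.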