Abstract
Although it is generally felt that the all-subsets, or “best” subset, approach is better than forward selection and backward elimination, the sequential procedures are still widely used. To see what advantage there is in doing all-subsets, this paper gives both theoretical and empirical comparisons. It is shown that the difference in favor of all-subsets can be arbitrarily large in examples where there are predictors which do poorly alone but do very well together. Also, empirical comparisons on nine data sets show big differences favoring all-subsets, when the differences are measured on the data (sample values). However, fairer comparisons based on known population values show very small differences favoring all-subsets. The only exception is the one data set which has predictors which do well together but poorly alone.