The sparsity and bias of the Lasso selection in high-dimensional linear regression
- 1 August 2008
- journal article
- Published by Institute of Mathematical Statistics in The Annals of Statistics
- Vol. 36 (4) , 1567-1594
- https://doi.org/10.1214/07-aos520
Abstract
Meinshausen and Bühlmann [Ann. Statist. 34 (2006) 1436–1462] showed that, for neighborhood selection in Gaussian graphical models, under a neighborhood stability condition, the LASSO is consistent even when the number of variables is of greater order than the sample size. Zhao and Yu [J. Mach. Learn. Res. 7 (2006) 2541–2567] formalized the neighborhood stability condition in the context of linear regression as a strong irrepresentable condition and showed that, under this condition, the LASSO selects exactly the set of nonzero regression coefficients, provided that these coefficients are bounded away from zero at a certain rate. In this paper, the regression coefficients outside an ideal model are assumed to be small, but not necessarily zero. Under a sparse Riesz condition on the correlation of design variables, we prove that the LASSO selects a model of the correct order of dimensionality, controls the bias of the selected model at a level determined by the contributions of small regression coefficients and threshold bias, and selects all coefficients of greater order than the bias of the selected model. Moreover, as a consequence of this rate consistency of the LASSO in model selection, the sum of error squares for the mean response and the ℓα-loss for the regression coefficients are proved to converge at the best possible rates under the given conditions. An interesting aspect of our results is that the logarithm of the number of variables can be of the same order as the sample size for certain random dependent designs.
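For readers who want the formal objects behind the abstract, the display below writes out the LASSO estimator together with the two design conditions it mentions. The notation (A₀ for the true support; q, c\*, c\* for the SRC rank and spectrum bounds) follows common usage and is a paraphrase, not the paper's verbatim statement.

```latex
% Lasso estimator for the linear model  y = X\beta + \varepsilon,
% with design X \in \mathbb{R}^{n \times p}, possibly p \gg n:
\[
\hat{\beta}(\lambda) = \arg\min_{\beta \in \mathbb{R}^{p}}
  \tfrac{1}{2}\,\lVert y - X\beta \rVert_2^2 + \lambda \lVert \beta \rVert_1 .
\]

% Strong irrepresentable condition (Zhao and Yu, 2006): with
% C = X^\top X / n partitioned over the true support A_0,
\[
\bigl\lVert C_{A_0^{c} A_0}\, C_{A_0 A_0}^{-1}\,
  \operatorname{sgn}(\beta_{A_0}) \bigr\rVert_\infty \le 1 - \eta
  \quad \text{for some } \eta > 0 .
\]

% Sparse Riesz condition with rank q and spectrum bounds 0 < c_* \le c^*:
\[
c_* \le \frac{\lVert X_A v \rVert_2^2}{n\,\lVert v \rVert_2^2} \le c^*
  \quad \text{for all } A \text{ with } |A| \le q
  \text{ and all } v \ne 0 .
\]
```

As an informal illustration of LASSO selection in the p ≫ n regime (this is not the paper's simulation design; the dimensions, signal strength, and penalty level below are hypothetical choices), a minimal scikit-learn sketch:

```python
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)
n, p, s = 100, 500, 5           # sample size, dimension (p >> n), sparsity

X = rng.standard_normal((n, p))
beta = np.zeros(p)
beta[:s] = 2.0                  # a few coefficients well above the noise level
y = X @ beta + rng.standard_normal(n)

# Penalty of order sqrt(log(p) / n), the usual scaling when log(p) may be
# comparable to n, as in the abstract's last remark.
lam = 2.0 * np.sqrt(np.log(p) / n)
fit = Lasso(alpha=lam).fit(X, y)

selected = np.flatnonzero(fit.coef_)
print("selected variables:", selected)  # typically recovers {0, ..., s-1}
```

With these hypothetical settings the selected set usually coincides with the s truly nonzero coefficients; weakening the signal or lowering the penalty shows how small coefficients fall below the threshold bias and drop in and out of the selected model.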