Some sharp performance bounds for least squares regression with L1 regularization
Open Access
- 1 October 2009
- journal article
- Published by Institute of Mathematical Statistics in The Annals of Statistics
- Vol. 37 (5A)
- https://doi.org/10.1214/08-aos659
Abstract
We derive sharp performance bounds for least squares regression with $L_1$ regularization from parameter estimation accuracy and feature selection quality perspectives. The main result proved for $L_1$ regularization extends a similar result in [Ann. Statist. 35 (2007) 2313--2351] for the Dantzig selector. It gives an affirmative answer to an open question in [Ann. Statist. 35 (2007) 2358--2364]. Moreover, the result leads to an extended view of feature selection that allows less restrictive conditions than some recent work. Based on the theoretical insights, a novel two-stage $L_1$-regularization procedure with selective penalization is analyzed. It is shown that if the target parameter vector can be decomposed as the sum of a sparse parameter vector with large coefficients and another less sparse vector with relatively small coefficients, then the two-stage procedure can lead to improved performance.Comment: Published in at http://dx.doi.org/10.1214/08-AOS659 the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics (http://www.imstat.org
Keywords
All Related Versions
This publication has 17 references indexed in Scilit:
- The Dantzig selector and sparsity oracle inequalitiesBernoulli, 2009
- Simultaneous analysis of Lasso and Dantzig selectorThe Annals of Statistics, 2009
- A generalized Dantzig selector with shrinkage tuningBiometrika, 2009
- Lasso-type recovery of sparse representations for high-dimensional dataThe Annals of Statistics, 2009
- The sparsity and bias of the Lasso selection in high-dimensional linear regressionThe Annals of Statistics, 2008
- High-dimensional generalized linear models and the lassoThe Annals of Statistics, 2008
- Discussion: The Dantzig selector: Statistical estimation when p is much larger than nThe Annals of Statistics, 2007
- Aggregation for Gaussian regressionThe Annals of Statistics, 2007
- Sparsity oracle inequalities for the LassoElectronic Journal of Statistics, 2007
- Decoding by Linear ProgrammingIEEE Transactions on Information Theory, 2005