Some sharp performance bounds for least squares regression with L1 regularization

Open Access

1 October 2009

journal article
Published by Institute of Mathematical Statistics in The Annals of Statistics

Vol. 37 (5A)
https://doi.org/10.1214/08-aos659

Abstract

We derive sharp performance bounds for least squares regression with $L_1$ regularization from parameter estimation accuracy and feature selection quality perspectives. The main result proved for $L_1$ regularization extends a similar result in [Ann. Statist. 35 (2007) 2313--2351] for the Dantzig selector. It gives an affirmative answer to an open question in [Ann. Statist. 35 (2007) 2358--2364]. Moreover, the result leads to an extended view of feature selection that allows less restrictive conditions than some recent work. Based on the theoretical insights, a novel two-stage $L_1$-regularization procedure with selective penalization is analyzed. It is shown that if the target parameter vector can be decomposed as the sum of a sparse parameter vector with large coefficients and another less sparse vector with relatively small coefficients, then the two-stage procedure can lead to improved performance.Comment: Published in at http://dx.doi.org/10.1214/08-AOS659 the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics (http://www.imstat.org

Keywords

All Related Versions

Version 1, 2009-08-20, ArXiv

This publication has 17 references indexed in Scilit:

The Dantzig selector and sparsity oracle inequalities
Bernoulli, 2009
Simultaneous analysis of Lasso and Dantzig selector
The Annals of Statistics, 2009
A generalized Dantzig selector with shrinkage tuning
Biometrika, 2009
Lasso-type recovery of sparse representations for high-dimensional data
The Annals of Statistics, 2009
The sparsity and bias of the Lasso selection in high-dimensional linear regression
The Annals of Statistics, 2008
High-dimensional generalized linear models and the lasso
The Annals of Statistics, 2008
Discussion: The Dantzig selector: Statistical estimation when p is much larger than n
The Annals of Statistics, 2007
Aggregation for Gaussian regression
The Annals of Statistics, 2007
Sparsity oracle inequalities for the Lasso
Electronic Journal of Statistics, 2007
Decoding by Linear Programming
IEEE Transactions on Information Theory, 2005