ABSTRACT
We consider the least-squares linear regression problem with regularization by the l1-norm, a problem usually referred to as the Lasso. In this paper, we present a detailed asymptotic analysis of model consistency of the Lasso. For various decays of the regularization parameter, we compute asymptotic equivalents of the probability of correct model selection (i.e., variable selection). For a specific rate of decay, we show that the Lasso selects all the variables that should enter the model with probability tending to one exponentially fast, while it selects each other variable with a strictly positive probability. We show that this property implies that if we run the Lasso for several bootstrapped replications of a given sample, then intersecting the supports of the Lasso bootstrap estimates leads to consistent model selection. This novel variable selection algorithm, referred to as the Bolasso, compares favorably to other linear regression methods on synthetic data and datasets from the UCI machine learning repository.
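The bootstrap-intersection procedure described above can be sketched in a few lines. This is a minimal illustration using scikit-learn's Lasso, not the paper's reference implementation; the function name `bolasso_support` and the choices of `alpha`, `n_boot`, and the random seed are assumptions made for the example.

```python
import numpy as np
from sklearn.linear_model import Lasso

def bolasso_support(X, y, alpha=0.1, n_boot=32, seed=0):
    """Bolasso sketch: run the Lasso on bootstrap replicates of (X, y)
    and intersect the supports (sets of nonzero coefficients)."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    support = np.ones(p, dtype=bool)  # start from the full variable set
    for _ in range(n_boot):
        idx = rng.integers(0, n, size=n)  # resample rows with replacement
        coef = Lasso(alpha=alpha).fit(X[idx], y[idx]).coef_
        support &= coef != 0  # keep only variables selected on every replicate
    return support
```

On data with a sparse underlying model, irrelevant variables are selected only with some probability on each replicate, so the intersection discards them while the relevant variables, selected with probability tending to one, survive.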
- Asuncion, A., & Newman, D. (2007). UCI machine learning repository.
- Bach, F. R. (2008). Consistency of the group Lasso and multiple kernel learning. J. Mach. Learn. Res., to appear.
- Bentkus, V. (2003). On the dependence of the Berry--Esseen bound on dimension. Journal of Statistical Planning and Inference, 113, 385--402.
- Boucheron, S., Lugosi, G., & Bousquet, O. (2004). Concentration inequalities. Advanced Lectures on Machine Learning. Springer.
- Breiman, L. (1996a). Bagging predictors. Machine Learning, 24, 123--140.
- Breiman, L. (1996b). Heuristics of instability and stabilization in model selection. Ann. Stat., 24, 2350--2383.
- Breiman, L. (1998). Arcing classifiers. Ann. Stat., 26, 801--849.
- Bühlmann, P. (2006). Boosting for high-dimensional linear models. Ann. Stat., 34, 559--583.
- Efron, B., Hastie, T., Johnstone, I., & Tibshirani, R. (2004). Least angle regression. Ann. Stat., 32, 407.
- Efron, B., & Tibshirani, R. J. (1998). An introduction to the bootstrap. Chapman & Hall.
- Fu, W., & Knight, K. (2000). Asymptotics for Lasso-type estimators. Ann. Stat., 28, 1356--1378.
- Lounici, K. (2008). Sup-norm convergence rate and sign concentration property of Lasso and Dantzig estimators. Electronic Journal of Statistics, 2.
- Meinshausen, N., & Yu, B. (2008). Lasso-type recovery of sparse representations for high-dimensional data. Ann. Stat., to appear.
- Tibshirani, R. (1994). Regression shrinkage and selection via the Lasso. J. Roy. Stat. Soc. B, 58, 267--288.
- Wainwright, M. J. (2006). Sharp thresholds for noisy and high-dimensional recovery of sparsity using l1-constrained quadratic programming (Tech. report 709). Dept. of Statistics, UC Berkeley.
- Yuan, M., & Lin, Y. (2007). On the non-negative garrotte estimator. J. Roy. Stat. Soc. B, 69, 143--161.
- Zhao, P., & Yu, B. (2006). On model selection consistency of Lasso. J. Mach. Learn. Res., 7, 2541--2563.
- Zou, H. (2006). The adaptive Lasso and its oracle properties. J. Am. Stat. Ass., 101, 1418--1429.
Bolasso: model consistent Lasso estimation through the bootstrap