<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Archiving and Interchange DTD v1.0 20120330//EN" "JATS-archivearticle1.dtd">
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <front>
    <journal-meta />
    <article-meta>
      <title-group>
        <article-title>Correlated Variable Selection in High-dimensional Linear Models using Dual Polytope Projection</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <string-name>Niharika Gauraha</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <contrib contrib-type="author">
          <string-name>Swapan K. Parui</string-name>
          <xref ref-type="aff" rid="aff0">0</xref>
        </contrib>
        <aff id="aff0">
          <label>0</label>
          <institution>Indian Statistical Institute</institution>
          ,
          <country country="IN">India</country>
        </aff>
      </contrib-group>
      <abstract>
        <p>[1] Robert Tibshirani. "Regression shrinkage and selection via the lasso". In: J. R. Statist. Soc. B 58 (1996), pp. 267-288. [2] Jie Wang et al. "Lasso Screening Rules via Dual Polytope Projection". In: NIPS (2013).</p>
      </abstract>
      <kwd-group>
        <kwd>Correlated Variable Selection</kwd>
        <kwd>Lasso</kwd>
        <kwd>Dual Polytope Projection</kwd>
        <kwd>High-dimensional Data Analysis</kwd>
      </kwd-group>
    </article-meta>
  </front>
  <body>
    <sec id="sec-1">
      <title />
      <p>We consider the case of high-dimensional linear models (p ≫ n) with strong
empirical correlation among variables. The Lasso is a widely used regularized
regression method for variable selection, but it tends to select a single variable
from a group of strongly correlated variables even if many or all of these variables
are important. In many situations it is desirable to identify all the relevant
correlated variables; examples include micro-array analysis and genome-wide
association studies. We propose to use the Dual Polytope Projection (DPP) rule
to select the relevant correlated variables that are not selected by the Lasso.</p>
      <p>We consider the usual linear model setup, given as Y = Xβ + ε. Let
λ ≥ 0 be a regularization parameter. Then the Lasso estimator (see [1]) is defined
as: β̂(λ) = argmin_{β ∈ ℝ^p} (1/2)‖Y - Xβ‖₂² + λ‖β‖₁. Let
λ_max = max_{1 ≤ j ≤ p} |X_j^T Y|; then for all λ ∈ [λ_max, ∞), we have
β̂(λ) = 0. It has been shown that screening methods based on the DPP rule are
highly effective in reducing the dimensionality by discarding the irrelevant
variables (see [2]). Suppose we want to compute the Lasso solution for a
λ ∈ (0, λ_max); the (global strong) DPP rule discards the jth variable whenever
|X_j^T Y| &lt; 2λ - λ_max (i.e., variables having smaller inner products
with the response).</p>
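      <p>The screening computation above can be sketched in a few lines of numpy. This is a minimal illustration, not code from the paper; the function name dpp_screen is ours, and the columns of X are assumed to be standardized.</p>
      <preformat>
```python
import numpy as np

def dpp_screen(X, Y, lam):
    """Global strong DPP rule (see [2]): return indices of variables KEPT.

    A variable j is discarded whenever |X_j^T Y| falls below 2*lam - lam_max.
    """
    corr = np.abs(X.T @ Y)   # inner products |X_j^T Y| for each column j
    lam_max = corr.max()     # smallest lambda yielding the all-zero solution
    kept = np.flatnonzero(corr >= 2 * lam - lam_max)
    return kept, lam_max
```
      </preformat>
      <p>For λ at or above λ_max / 2 the threshold 2λ - λ_max is positive and screening starts to discard variables; for smaller λ every variable is kept.</p>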
      <p>Exploiting the above property, we propose a two-stage procedure for variable
selection. At the first stage, we perform Lasso with cross-validation and
choose the regularization parameter λ_Lasso that optimizes the prediction. At
the second stage, we select all the variables for which |X_j^T Y| ≥ 2λ_Lasso - λ_max.
Although the Lasso solution at λ_Lasso does not include all the relevant correlated
variables, these correlated variables have inner products of similar magnitude
with the response. Hence, all the relevant correlated predictors also get
selected at the second stage.</p>
    </sec>
  </body>
  <back>
    <ref-list />
  </back>
</article>