matrix of the unknown regression coefficients, and $W = [\epsilon_1, \cdots, \epsilon_M] \in \mathbb{R}^{n \times M}$ is the corresponding noise matrix. Here the error term $\epsilon_m$ is an $n \times 1$ noise vector that follows a multivariate normal distribution with zero mean vector and covariance matrix $\sigma^2 I_n$, where $I_n$ is the $n$-dimensional identity matrix. Thus a group of $M$ response vectors are to be regressed on the same design matrix $X$. The model can also be written as
$$Y_m = X\beta_m + \epsilon_m \sim N_n(X\beta_m, \sigma^2 I_n), \quad m = 1, \cdots, M, \qquad (2)$$
where $\beta_m = (\beta_{1,m}, \cdots, \beta_{p,m})$ is the coefficient vector for the $m$-th response vector $Y_m$. Then the estimation of each column of $B$, $\beta_m$, is a single linear regression problem with response vector $Y_m$ and design matrix $X$, and can be solved individually. However, in this paper, we solve the $M$ individual regression problems together by exploiting the similarities among the $\beta_m$, or by imposing constraints on the matrix $B$.
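The stacked model above can be sketched numerically; a minimal numpy illustration, where the sizes, seed, and sparsity pattern are illustrative choices and not taken from the paper:

```python
import numpy as np

rng = np.random.default_rng(0)
n, p, M = 100, 10, 3       # illustrative sample size, predictors, responses
sigma = 0.5

X = rng.standard_normal((n, p))           # shared design matrix
B = np.zeros((p, M))
B[:3, :] = rng.standard_normal((3, M))    # only the first 3 variables are active
W = sigma * rng.standard_normal((n, M))   # noise matrix; each column ~ N_n(0, sigma^2 I_n)
Y = X @ B + W                             # Y_m = X beta_m + eps_m, stacked as columns

# Each beta_m can be estimated separately; ordinary least squares on all
# M columns at once returns the M individual solutions in one call.
B_hat = np.linalg.lstsq(X, Y, rcond=None)[0]
```

This baseline treats the $M$ regressions as unrelated; the approaches discussed in the paper instead share information across the columns of $B$.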
In particular, we are interested in the variable selection problem for the multi-response model (2). Suppose $S_m$ is the support set for the $m$-th response vector, i.e.

$$S_m = \{ j \in \{1, \cdots, p\} \mid \beta_{j,m} \neq 0 \}. \qquad (3)$$
In some applications, $S_m$ should be the same or similar for different $m$. Thus it is more beneficial to identify, simultaneously, the set of variables which are related to any of the multiple response vectors than to identify each $S_m$ separately. Thus, similar to Obozinski et al. (2011) [10], we target the "support union recovery" problem, i.e., we want to recover the union of the support sets,

$$S = \bigcup_m S_m = \{\, j \in \{1, \cdots, p\} \mid \beta_{j,m} \neq 0 \text{ for some } m \in \{1, \cdots, M\} \,\}.$$
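The supports and their union can be read directly off the nonzero pattern of the coefficient matrix $B$; a small sketch with a made-up $B$ (0-based indices for convenience):

```python
import numpy as np

# Hypothetical p x M coefficient matrix (p = 4 variables, M = 2 responses).
B = np.array([[1.2,  0.0],
              [0.0,  0.0],
              [0.0, -0.7],
              [2.1,  1.5]])

# S_m = {j : beta_{j,m} != 0}, one support set per response (0-based j).
supports = [set(np.flatnonzero(B[:, m])) for m in range(B.shape[1])]

# Support union S: variables active in at least one response.
S = set().union(*supports)
```

Here the individual supports differ, but the union collects every row of $B$ with a nonzero entry.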
In this paper, a Bayesian approach is adopted and the corresponding Bayesian algorithms are proposed to recover the unknown support set $S$.
3 Bayesian Methods for Support Recovery

3.1 Group-Wise Gibbs Sampler
In the support union recovery problem, Obozinski et al. (2011) [10] set the group structure for each variable across the multiple response vectors, and the group Lasso approach was adopted. Consider the corresponding Bayesian approach. It is straightforward to apply a Bayesian group selection algorithm in place of the group Lasso approach. Thus one set of indicators is defined to denote whether $X_j$ is active or not. Similar to group Lasso, we want to select the "best" subset of variables from $X_1, \cdots, X_p$ to explain the multiple responses $Y_1, \ldots, Y_M$ simultaneously.
First, following SSVS in George and McCulloch (1993) [7], a $p \times 1$ vector of indicator variables, $\gamma = (\gamma_1, \ldots, \gamma_p)$, is introduced to indicate which variables are selected. It is defined as:

$$\gamma_j = \begin{cases} 1, & \text{if } X_j \text{ is selected or active} \\ 0, & \text{if } X_j \text{ is not selected or inactive} \end{cases} \quad j = 1, \cdots, p. \qquad (4)$$
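Under this group-wise scheme, a single indicator per variable covers all $M$ responses: $\gamma_j = 1$ exactly when the $j$-th row of $B$ has a nonzero entry. A toy sketch (the coefficient matrix is made up for illustration):

```python
import numpy as np

# Toy coefficient matrix: rows are variables X_j, columns are responses Y_m.
B = np.array([[0.9, 0.0, 0.4],
              [0.0, 0.0, 0.0],
              [0.0, 1.1, 0.0]])

# gamma_j = 1 iff variable X_j is active for at least one response,
# i.e. the j-th row of B is not all zero (the group is the row of B).
gamma = (B != 0).any(axis=1).astype(int)
```

Variable 2 (second row) is inactive across every response, so its indicator is 0; the others are selected.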