Algorithm 1: Group-Wise Gibbs Sampler for Support Recovery
1. Randomly select a variable $X_j$. Compute $R_{j,m} = Y_m - \sum_{i \neq j} X_i \beta_{i,m}$, for $m = 1, \ldots, M$.
2. Compute the likelihood ratio $Z_j$ according to Eq. (6), and then evaluate the posterior probability of $\delta_j$:
$$P(\delta_j = 1 \mid Y, \delta_{-j}, \{\beta_{-j,m}, m = 1, \ldots, M\}, \sigma) = \frac{(1 - \theta_j) Z_j}{(1 - \theta_j) Z_j + \theta_j}. \qquad (7)$$
3. Sample $\delta_j$ based on the posterior probability in (7). If $\delta_j = 0$, then set $\beta_{j,m} = 0$, $m = 1, \ldots, M$; otherwise, sample $\beta_{j,m} \sim N(r_{j,m}, \sigma^2_{j,m})$.
4. After repeating the above steps for all variables, compute the current residual matrix, $\mathrm{Res} = Y - XB$. Then sample $\sigma^2 \sim IG\!\left(a + \frac{n}{2},\; \frac{\mathrm{tr}(\mathrm{diag}(\mathrm{Res}^\top \mathrm{Res}))/M + b}{2}\right)$. Go to Step 1.
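The sweep above can be sketched in Python. This is a minimal illustration, not the authors' implementation: Eq. (6) is not reproduced in this excerpt, so the likelihood ratio below uses a standard Gaussian spike-and-slab formula as a stand-in, a single slab variance `tau2` replaces the per-entry variances, and the inverse-gamma hyperparameters `a` and `b` are assumed.

```python
import numpy as np

def gibbs_sweep(X, Y, B, delta, theta, sigma2, tau2, a=1.0, b=1.0, rng=None):
    """One full sweep of a group-wise Gibbs sampler in the spirit of Algorithm 1.

    X: (n, p) design, Y: (n, M) responses, B: (p, M) coefficients,
    delta: (p,) binary indicators, theta: (p,) prior P(delta_j = 0),
    sigma2: noise variance, tau2: slab prior variance (assumed shared here).
    """
    rng = rng or np.random.default_rng()
    n, p = X.shape
    M = Y.shape[1]
    for j in rng.permutation(p):
        # Step 1: residual with the j-th variable's contribution removed
        R_j = Y - (X @ B - np.outer(X[:, j], B[j]))
        xx = X[:, j] @ X[:, j]
        post_var = 1.0 / (xx / sigma2 + 1.0 / tau2)       # conjugate posterior variance
        post_mean = post_var * (X[:, j] @ R_j) / sigma2   # one mean per response
        # Step 2: spike-and-slab marginal likelihood ratio (stand-in for Eq. (6))
        log_Z = 0.5 * M * np.log(post_var / tau2) + 0.5 * np.sum(post_mean**2) / post_var
        Z_j = np.exp(min(log_Z, 700.0))                   # guard against overflow
        prob = (1 - theta[j]) * Z_j / ((1 - theta[j]) * Z_j + theta[j])  # Eq. (7)
        # Step 3: sample delta_j, then beta_{j,m}
        delta[j] = int(rng.random() < prob)
        if delta[j]:
            B[j] = post_mean + np.sqrt(post_var) * rng.standard_normal(M)
        else:
            B[j] = 0.0
    # Step 4: sample sigma^2 from its inverse-gamma full conditional
    Res = Y - X @ B
    shape = a + n / 2.0
    scale = (np.trace(Res.T @ Res) / M + b) / 2.0
    sigma2 = 1.0 / rng.gamma(shape, 1.0 / scale)          # IG draw via 1/Gamma
    return B, delta, sigma2
```

Repeating `gibbs_sweep` yields a Markov chain over $(B, \delta, \sigma^2)$ whose $\delta$ draws estimate the support.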
3.2 Two-Layer Structure and Two-Layer Gibbs Sampler
In the group selection methods, once a variable, $X_j$, is selected, $X_j$ is active for all the responses, $Y_1, \ldots, Y_M$. However, we can further assume that the selected variable might not be active for all response vectors simultaneously. In other words, we are interested in finding the best union of support sets, $S$, and we also assume that a variable in $S$ might be inactive for some response vectors. Therefore, unlike the single-indicator set-up in the group-wise Gibbs sampler, two nested sets of binary indicator variables are used. The first set of indicators, $\delta = (\delta_1, \ldots, \delta_p)$, is associated with the variables $X_1, \ldots, X_p$, respectively, and $\delta_j$ indicates whether the variable $X_j$ is active for any of the response vectors. Specifically, if $\delta_j = 1$, then the variable $X_j$ is selected, and $\delta_j = 0$ otherwise. In the second indicator set, each indicator is associated with a variable and a response vector, indicating whether this variable is active for explaining that particular response vector. Thus for each variable $X_j$, we define the indicator vector $\eta^{(j)} = (\eta_{j,1}, \ldots, \eta_{j,M})$; if $\eta_{j,m} = 1$, the variable $X_j$ is active for the $m$-th response, $Y_m$, and $\eta_{j,m} = 0$ otherwise.
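To make the nested-indicator structure concrete, here is a small numerical illustration; the specific values of $\delta$ and $\eta$ are invented for the example.

```python
import numpy as np

# Hypothetical indicators for p = 4 variables and M = 3 responses.
delta = np.array([1, 0, 1, 0])  # first layer: is the variable selected at all?
eta = np.array([[1, 0, 1],      # second layer: eta[j, m] only matters when
                [0, 0, 0],      #   delta[j] = 1; when delta[j] = 0 the whole
                [1, 1, 0],      #   row eta[j, :] is forced to 0
                [0, 0, 0]])

# Union of support sets: S = {j : delta_j = 1}
S = {j for j in range(4) if delta[j] == 1}
# Per-response supports: S_m = {j : delta_j * eta_{j,m} = 1}
S_m = [{j for j in range(4) if delta[j] * eta[j, m] == 1} for m in range(3)]

print(S)       # {0, 2}
print(S_m)     # [{0, 2}, {2}, {0}]
assert S == set().union(*S_m)  # S is exactly the union of per-response supports
```

Here variable 2 is in $S$ but inactive for the third response, which is precisely the flexibility the second indicator layer adds.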
Similar to the group-wise Gibbs sampler, the prior distribution of $\delta_j$ is also assumed to follow the Bernoulli distribution with $P(\delta_j = 0) = \theta_j$ and $P(\delta_j = 1) = 1 - \theta_j$, i.e. $\delta_j \sim \mathrm{Ber}(1 - \theta_j)$. Consider the prior assumption for the second set of indicators. Following Chen et al. (2014) [3], the prior distribution of an indicator in the second set, $\eta_{j,m}$, is chosen as a mixture distribution depending on the indicator in the first set, $\delta_j$, and is represented as
$$\eta_{j,m} \mid \delta_j \sim (1 - \delta_j)\,\delta_0 + \delta_j\,\mathrm{Ber}(1 - \rho_{j,m}), \qquad (8)$$
where $P(\eta_{j,m} = 0) = \rho_{j,m}$. Based on Eq. (8), if the $j$-th variable, $X_j$, is not selected in $S$, i.e. $\delta_j = 0$, then $\eta_{j,m} = 0$ for all $m = 1, \ldots, M$; however, when $\delta_j = 1$, $\eta_{j,m}$ can still be 0 or 1 due to the Bernoulli prior distribution. Then for the coefficient, $\beta_{j,m}$, given the indicators $\delta_j$ and $\eta_{j,m}$, the prior distribution of $\beta_{j,m}$ can be defined as
$$\beta_{j,m} \mid \delta_j, \eta_{j,m} \sim (1 - \delta_j \eta_{j,m})\,\delta_0 + \delta_j \eta_{j,m}\, N(0, \tau_{j,m}). \qquad (9)$$
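A draw from the two-layer prior defined by Eqs. (8) and (9) can be sketched as follows; the function name and the hyperparameter values in the test are our own illustrative choices, not from the source.

```python
import numpy as np

def sample_two_layer_prior(theta, rho, tau, rng):
    """Draw (delta, eta, beta) from the two-layer prior of Eqs. (8)-(9).

    theta: (p,) with P(delta_j = 0) = theta_j
    rho:   (p, M) with P(eta_{j,m} = 0) = rho_{j,m}
    tau:   (p, M) slab variances tau_{j,m}
    """
    p, M = rho.shape
    # First layer: delta_j ~ Ber(1 - theta_j)
    delta = (rng.random(p) < 1 - theta).astype(int)
    # Eq. (8): eta_{j,m} | delta_j ~ (1 - delta_j) delta_0 + delta_j Ber(1 - rho_{j,m})
    # (the delta[:, None] factor realizes the point mass at 0 when delta_j = 0)
    eta = delta[:, None] * (rng.random((p, M)) < 1 - rho).astype(int)
    # Eq. (9): beta_{j,m} is 0 unless delta_j * eta_{j,m} = 1, else N(0, tau_{j,m})
    beta = delta[:, None] * eta * rng.normal(0.0, np.sqrt(tau))
    return delta, eta, beta
```

Note how the spike component $\delta_0$ in both mixtures is realized simply by multiplying with the corresponding indicator.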