As discussed, group lasso cannot select a group that shares any feature with a group
that is not selected. That is, the feature indices in the selected groups cannot belong
to any of the unselected groups, as stated in the expression above.
For example, consider three groups $G_1 = \{1, 2, 4\}$, $G_2 = \{2, 3\}$ and $G_3 = \{4, 5\}$
with $p = 5$. Suppose that $G_2$ and $G_3$ are not selected. Then the only possible option
is to select the 1st feature, that is, the set of candidate features is
\[ \{1\} \subset (G_2 \cup G_3)^c . \]
On the other hand, in the case of $\Psi_O$ for overlapping groups,
\[ \{ j : \beta_j \neq 0 \} \subset \bigcup_{k \,:\, \gamma^k \neq 0} G_k . \]
This is indeed obvious since $\Psi_O$ allows for selecting any group whenever its
associated $\gamma^k$ vector is nonzero. In the example above with three groups, we can
select any combination of $G_1$, $G_2$ and $G_3$.
Note that when the groups define a partition of the features, that is, there is no overlap
among groups, the two expressions above become the same. Therefore it is
advised to apply group lasso in Sect. 14.2.1 only if there is no overlap, or if there
exists overlap but the set of features that can be selected fits particular purposes.
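The difference between the two selection rules can be checked on the three-group example above. The following is a minimal sketch, not part of the original text: the function names `group_lasso_candidates` and `overlap_candidates` are hypothetical, feature indices are assumed to be 1-based, and groups are represented as plain Python sets.

```python
def group_lasso_candidates(groups, unselected, p):
    """Features that may be nonzero under plain group lasso:
    the complement of the union of all unselected groups."""
    zeroed = set()
    for k in unselected:
        zeroed |= groups[k]
    return set(range(1, p + 1)) - zeroed


def overlap_candidates(groups, selected):
    """Features that may be nonzero under the overlapping regularizer:
    the union of the selected groups."""
    allowed = set()
    for k in selected:
        allowed |= groups[k]
    return allowed


# The example from the text: G1 = {1, 2, 4}, G2 = {2, 3}, G3 = {4, 5}, p = 5.
groups = {1: {1, 2, 4}, 2: {2, 3}, 3: {4, 5}}
p = 5

gl = group_lasso_candidates(groups, unselected=[2, 3], p=p)
ov = overlap_candidates(groups, selected=[1])
print(sorted(gl))  # [1]
print(sorted(ov))  # [1, 2, 4]

# With a partition (no overlap) both rules give the same candidate set.
partition = {1: {1, 2}, 2: {3}, 3: {4, 5}}
same = (group_lasso_candidates(partition, unselected=[2, 3], p=p)
        == overlap_candidates(partition, selected=[1]))
print(same)  # True
```

With overlapping groups, leaving out $G_2$ and $G_3$ removes features 2 and 4 from $G_1$ under plain group lasso, whereas the overlapping formulation keeps all of $G_1$ available.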
14.2.2.2 Reformulation to Group Lasso
In the definition of the overlapping group lasso regularizer in (14.8), the coefficient
vector $\beta$ is expressed as the summation of all $\gamma^k$ vectors over group indices
$k = 1, 2, \dots, K$, that is,
\[ \beta = \sum_{k=1}^{K} \gamma^k . \]
This result can be used in combination with the loss function $f(\beta, \beta_0)$ discussed in
Sect. 14.1.1.1. Taking the loss function for logistic regression, we get
\[ f(\gamma^1, \dots, \gamma^K, \beta_0) := \sum_{i=1}^{n} \log\left( 1 + \exp\left( -y_i \left( \sum_{k=1}^{K} (\gamma^k)^T x_i + \beta_0 \right) \right) \right) . \]
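As an illustration of the logistic loss written in terms of the per-group vectors, the following is a minimal sketch using only the Python standard library. It assumes labels $y_i \in \{-1, +1\}$ and that each $\gamma^k$ is stored as a full-length vector of size $p$; the function name `logistic_loss_latent` and the sample data are hypothetical, not taken from the text.

```python
import math


def logistic_loss_latent(gammas, beta0, X, y):
    """Sum over samples of log(1 + exp(-y_i * (beta^T x_i + beta0))),
    where beta is the sum of the per-group vectors in `gammas`."""
    p = len(X[0])
    # beta = sum_k gamma^k, computed coordinate by coordinate.
    beta = [sum(g[j] for g in gammas) for j in range(p)]
    total = 0.0
    for x_i, y_i in zip(X, y):
        margin = y_i * (sum(b * v for b, v in zip(beta, x_i)) + beta0)
        # Numerically stable evaluation of log(1 + exp(-margin)).
        if margin >= 0:
            total += math.log1p(math.exp(-margin))
        else:
            total += -margin + math.log1p(math.exp(margin))
    return total


# Hypothetical data: gamma^k is nonzero only on G1 = {1,2,4}, G2 = {2,3}, G3 = {4,5}.
gammas = [
    [0.5, -0.2, 0.0, 0.1, 0.0],
    [0.0, 0.3, -0.4, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.2, 0.7],
]
X = [[1.0, 0.0, 2.0, -1.0, 0.5], [0.0, 1.0, -1.0, 2.0, 1.0]]
y = [1, -1]
loss = logistic_loss_latent(gammas, beta0=0.0, X=X, y=y)
```

When all $\gamma^k$ and $\beta_0$ are zero, every sample contributes $\log 2$, which gives a quick sanity check for an implementation.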
This can be further simplified by constructing two new vectors,
\[ \tilde{\gamma} := \left( (\gamma^1_{G_1})^T, (\gamma^2_{G_2})^T, \dots, (\gamma^K_{G_K})^T \right)^T , \]
\[ \tilde{x}_i := \left( (x_{i G_1})^T, (x_{i G_2})^T, \dots, (x_{i G_K})^T \right)^T . \]
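The construction of the two stacked vectors can be made concrete with the three-group example from the previous section. The sketch below is an illustration under stated assumptions: groups are ordered lists of 1-based feature indices, each $\gamma^k$ is zero outside its group $G_k$, and the helper names `stack_gammas` and `replicate_features` as well as the numeric values are hypothetical.

```python
groups = [[1, 2, 4], [2, 3], [4, 5]]  # G1, G2, G3 with 1-based indices


def stack_gammas(gammas, groups):
    """Concatenate gamma^k restricted to G_k, for k = 1, ..., K."""
    return [gammas[k][j - 1] for k, G in enumerate(groups) for j in G]


def replicate_features(x, groups):
    """Concatenate x restricted to each group; a feature that belongs
    to several groups is copied once per group."""
    return [x[j - 1] for G in groups for j in G]


def dot(a, b):
    return sum(u * v for u, v in zip(a, b))


gammas = [
    [0.5, -0.2, 0.0, 0.1, 0.0],  # supported on G1
    [0.0, 0.3, -0.4, 0.0, 0.0],  # supported on G2
    [0.0, 0.0, 0.0, 0.2, 0.7],   # supported on G3
]
x = [1.0, 0.0, 2.0, -1.0, 0.5]

gamma_tilde = stack_gammas(gammas, groups)
x_tilde = replicate_features(x, groups)

# Inner product before and after the reformulation.
lhs = sum(dot(g, x) for g in gammas)
rhs = dot(gamma_tilde, x_tilde)
print(len(x_tilde))            # 7
print(abs(lhs - rhs) < 1e-12)  # True
```

The stacked vectors have length $|G_1| + |G_2| + |G_3| = 7$ rather than $p = 5$, because features 2 and 4 each appear in two groups.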
Here $\tilde{\gamma}$ is a collection of nonzero components in $\{\gamma^k\}_{k=1}^{K}$ vectors, and $\tilde{x}_i$ is a copy
of an input vector $x_i$ with features replicated if they belong to multiple groups. Then
the loss function can be rewritten as,