Information Technology Reference
In-Depth Information
Before discussing the details of the topic, we simplify the expression of
f
by
defining a collection of responses
y
y
1
y
2
y
n
T
n
and a collection of
=
(
,
,...,
)
∈ R
x
1
x
2
x
n
T
R
n
×
p
. Omitting the bias
input vectors as rows in a matrix
X
=
(
,
,...,
)
∈
term
ʲ
0
for further simplification, we can rewrite the expression in (
14.2
)as,
1
2
2
f
(
ʲ
)
=
y
−
X
ʲ
2
.
(14.5)
We denote the column submatrix of
X
corresponding to a group
G
k
by
X
G
k
.Asin
[
24
], we further assume that the columns in
X
G
k
are orthonormal, that is,
X
G
k
X
G
k
=
I
|
G
k
|
, an identity matrix of dimension
. Then the group lasso problem
formulation in (
14.4
) defines a convex minimization problem for which solutions can
be characterized in a closed form by first-order optimality conditions, as shown in the
following theorem. The theorem is from [
24
], except for that our proof provides more
rigorous arguments on how optimality conditions lead to the conclusion, compared
to the original proof.
|
G
k
|×|
G
k
|
1
2
2
Theorem 1
For a group lasso problem defined with f
(
ʲ
)
=
2
y
−
X
ʲ
and
X
G
k
X
G
k
=
I
|
G
k
|
for k
=
1
,
2
,...,
K , an optimal solution
ʲ
is given by
1
ʻ
ʲ
G
k
=
−
s
k
,
k
=
1
,
2
,...,
K
,
s
k
2
+
X
G
k
y
ʲ
−
G
k
,
where
s
k
:=
−
X
ʲ
−
G
k
is a vector with the same elements as
ʲ
except
for having zeros in the components at j
∈
G
k
, and
(
·
)
+
:=
max
(
0
,
·
)
.
Proof
For
0
the objective function of (
14.4
) is differentiable, and therefore
the first-order optimality condition is,
ʲ
G
k
=
ʲ
G
k
ʲ
G
k
2
=
X
G
k
(
−
y
−
X
ʲ
)
+
ʻ
0
,
Since
X
G
k
X
G
k
=
I
, this is equal to
|
G
k
|
ʻ
ʲ
G
k
2
+
1
X
G
k
y
ʲ
−
G
k
+
−
−
X
ʲ
G
k
=
0
.
(14.6)
Suppose that
ʲ
G
k
=
0
is optimal. We can consider the gradients of
f
(
ʲ
)
+
ʨ
G
(
ʲ
)
at
T
two nonzero points,
ʲ
+
ʷ
and
ʲ
−
ʷ
where
ʷ
:=
(
0
,...,
0
,ʵ,...,ʵ,
0
,...,
0
)
is
a
p
dimensional vector with
ʵ>
0 at components
j
∈
G
k
. Since
ʲ
G
k
=
0
is optimal,
the two gradients must satisfy
1
1
ʻ
ʵ
√
|
ʻ
ʵ
√
|
−
s
k
+
G
k
|
+
ʷ
G
k
>
0
,
−
s
k
−
G
k
|
+
ʷ
G
k
<
0
.
Search WWH ::
Custom Search