$$
\tilde f(\gamma, \beta_0) := \sum_{i=1}^{n} \log\!\left( 1 + \exp\!\left( - y_i \left( \beta_0 + \gamma^T \tilde{x}_i \right) \right) \right).
$$

Finally, we define new groups

$$
H_k := \left\{ j \;:\; \sum_{s=1}^{k-1} |G_s| + 1 \;\le\; j \;\le\; \sum_{s=1}^{k} |G_s| \right\}, \quad k = 1, 2, \ldots, K.
$$

Then the optimization problem for overlapping group lasso can be written in the same form as that of group lasso in (14.4),

$$
\min_{\gamma, \beta_0} \; \tilde f(\gamma, \beta_0) + \lambda \sum_{k=1}^{K} \left\| \gamma_{H_k} \right\|_2 .
$$
Therefore, the same algorithm for group lasso can be applied for overlapping group
lasso.
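The reduction above can be sketched in a few lines of Python (an illustrative sketch with assumed names, using a NumPy design matrix): each column of $X$ is duplicated once per group that contains it, producing the replicated matrix $\tilde{X}$ together with the new disjoint groups $H_k$.

```python
import numpy as np

def expand_overlapping_groups(X, groups):
    """Duplicate columns of X so overlapping groups become disjoint.

    X      : (n, p) design matrix
    groups : list of index lists G_1, ..., G_K (may overlap)
    Returns the expanded matrix X_tilde and disjoint groups H_1, ..., H_K,
    where H_k is a contiguous block of columns of X_tilde.
    """
    cols, H, start = [], [], 0
    for G in groups:
        cols.append(X[:, G])                          # replicate the columns in G_k
        H.append(list(range(start, start + len(G))))  # H_k: next |G_k| column indices
        start += len(G)
    return np.hstack(cols), H

# Two groups sharing feature 1: the shared column appears twice in X_tilde.
X = np.arange(12.0).reshape(4, 3)
X_tilde, H = expand_overlapping_groups(X, [[0, 1], [1, 2]])
print(X_tilde.shape)  # (4, 4)
print(H)              # [[0, 1], [2, 3]]
```

An ordinary (non-overlapping) group lasso solver can then be run on $\tilde{X}$ with groups $H_k$, and the coefficients for a shared feature are recovered by summing its replicated entries.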
14.2.3 Sparse Group Lasso
In group lasso (Sect. 14.2.1) and overlapping group lasso (Sect. 14.2.2), the coefficient subvectors corresponding to selected groups are not necessarily sparse, as briefly discussed in Sect. 14.2.1.1. This would be undesirable when we try to identify only a few important features inside selected groups.

An alternative is the sparse group lasso proposed by [22]. It tries to induce sparsity in coefficients not only groupwise but also within groups, by using a modified formulation of group lasso in (14.4):
$$
\min_{\beta, \beta_0} \; f(\beta, \beta_0) + \lambda \left\{ \alpha \|\beta\|_1 + (1 - \alpha)\, \Omega_G(\beta) \right\}. \qquad (14.9)
$$
Here $\alpha \in [0, 1]$ is a parameter that determines the mixing of the $\ell_1$ and the group ($\ell_1/\ell_2$) norms, similarly to a constant in elastic net (14.3). Overlapping group lasso discussed in Sect. 14.2.2 can be extended in a similar fashion, by augmenting its objective function with an $\Omega_{G,1}$ term.
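To make the penalty in (14.9) concrete, the following sketch evaluates $\lambda \{ \alpha \|\beta\|_1 + (1-\alpha)\, \Omega_G(\beta) \}$, assuming the unweighted group penalty $\Omega_G(\beta) = \sum_k \|\beta_{G_k}\|_2$ (function and variable names are illustrative):

```python
import numpy as np

def sparse_group_lasso_penalty(beta, groups, lam, alpha):
    """lam * (alpha * ||beta||_1 + (1 - alpha) * sum_k ||beta_{G_k}||_2)."""
    l1 = np.sum(np.abs(beta))
    group = sum(np.linalg.norm(beta[G]) for G in groups)
    return lam * (alpha * l1 + (1.0 - alpha) * group)

beta = np.array([3.0, -4.0, 0.0, 0.0])
groups = [[0, 1], [2, 3]]
print(sparse_group_lasso_penalty(beta, groups, lam=1.0, alpha=0.5))  # 6.0
```

Setting $\alpha = 1$ recovers the plain lasso penalty, while $\alpha = 0$ recovers the group lasso penalty, which is how the mixing parameter interpolates between the two.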
14.2.3.1 Computation of Solutions
Similarly to group lasso, sparse group lasso in (14.9) defines a convex minimization problem, for which necessary and sufficient conditions for optimality can be characterized.
To understand the optimality conditions, we need the notion of subgradient. For a function $h: \mathbb{R}^p \to \mathbb{R}$, $g \in \mathbb{R}^p$ is a subgradient for $h$ at $\beta \in \mathbb{R}^p$ if

$$
h(\beta') \ge h(\beta) + g^T (\beta' - \beta), \quad \forall \beta' \in \mathbb{R}^p .
$$
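As a quick numeric check of the subgradient inequality (an illustrative sketch, not part of the original text), one can verify it for the convex function $h(\beta) = \|\beta\|_1$ with a vector $g$ built from the signs of the nonzero coordinates of $\beta$ and any value in $[-1, 1]$ at the zero coordinates:

```python
import numpy as np

rng = np.random.default_rng(0)

def h(beta):                       # h(beta) = ||beta||_1, a convex function
    return np.sum(np.abs(beta))

beta = np.array([1.5, -2.0, 0.0])
g = np.array([1.0, -1.0, 0.3])     # sign(beta_j) where beta_j != 0; 0.3 in [-1, 1] at the zero

# h(b') >= h(beta) + g^T (b' - beta) must hold for every b'
for _ in range(1000):
    b_prime = rng.normal(size=3) * 10
    assert h(b_prime) >= h(beta) + g @ (b_prime - beta) - 1e-12
print("subgradient inequality holds at all sampled points")
```

The inequality holds because $|g_j| \le 1$ gives $g^T \beta' \le \|\beta'\|_1$, while $g_j \beta_j = |\beta_j|$ gives $g^T \beta = \|\beta\|_1$, so the right-hand side reduces to $g^T \beta' \le h(\beta')$.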
If $h$ is a convex function on $\mathbb{R}^p$, it has a subgradient at every point in $\mathbb{R}^p$. For example, $h(\beta) = \|\beta\|_1$ is a convex function and its subgradient $g$ at $\beta$ is defined by