LASSO
The regularized regression discussed in Sect. 14.1.1, with the lasso regularizer, is used to select individual features. The glmnet R package [10] was used for our experiments.
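As a minimal sketch of this step, lasso-based feature selection with glmnet could be set up as follows. The synthetic data, the binomial family, and all variable names are illustrative assumptions, not the chapter's actual code.

library(glmnet)

set.seed(1)
n <- 92; p <- 100                              # small sample size mirrored only for illustration
x <- matrix(rnorm(n * p), n, p,
            dimnames = list(NULL, paste0("feat", 1:p)))
y <- rbinom(n, 1, plogis(x[, 1] - x[, 2]))     # placeholder binary outcome

cv_fit <- cv.glmnet(x, y, family = "binomial", alpha = 1)   # alpha = 1: pure lasso penalty

# Features with nonzero coefficients at the cross-validated lambda are the selected ones.
b <- as.matrix(coef(cv_fit, s = "lambda.min"))
selected <- setdiff(rownames(b)[b[, 1] != 0], "(Intercept)")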
GL
The group lasso algorithm discussed in Sect. 14.2.1 is used to select grouped features.
For solving group lasso problems, the grplasso [16] or the SGL [22] R packages can be used. The latter is designed for the sparse group lasso, but it can solve group lasso problems by setting the parameter α = 0, so that the sparse group lasso formulation in (14.9) is optimized without an ℓ₁ term. We used the SGL package for our experiments.
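As a sketch, a plain group lasso fit with the SGL package sets α = 0; the grouping, the placeholder data, and the logit response type below are assumptions made only for illustration.

library(SGL)

set.seed(1)
n <- 92; p <- 40
x <- matrix(rnorm(n * p), n, p)
y <- rbinom(n, 1, 0.5)                 # placeholder binary outcome
index <- rep(1:10, each = 4)           # 10 feature groups of size 4 (assumed grouping)

# alpha = 0 drops the l1 term from the sparse group lasso objective in (14.9),
# leaving only the groupwise penalty.
fit <- SGL(data = list(x = x, y = y), index = index, type = "logit", alpha = 0)

# fit$beta holds one coefficient vector per value on the lambda path; a group is
# excluded at a given lambda when all of its coefficients are zero.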
SGL
The sparse group lasso discussed in Sect. 14.2.3 is used to perform both groupwise
and within-group individual feature selection. For the analysis we used the SGL package.
Note that the parameter α in (14.9) can be chosen to solve the lasso problem by setting α = 1, or the group lasso problem by setting α = 0. An optimal value of α can be determined, for instance, by cross validation, searching on a two-dimensional grid for both λ and α. For the purpose of demonstration, we used a fixed value α = 0.95.
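As a sketch of this choice, the fixed-α fit and a simple two-dimensional search over λ and α could look like the following; the placeholder data, the grid of α values, and the use of the cvSGL output (the lldiff component) are assumptions, not the chapter's code.

library(SGL)

set.seed(1)
n <- 92; p <- 40
x <- matrix(rnorm(n * p), n, p)
y <- rbinom(n, 1, 0.5)
index <- rep(1:10, each = 4)
dat <- list(x = x, y = y)

# Fixed alpha = 0.95, as in the demonstration above
fit <- SGL(dat, index, type = "logit", alpha = 0.95)

# Two-dimensional search: for each alpha on a grid, cvSGL evaluates a lambda path;
# the (alpha, lambda) pair with the smallest cross-validated error is kept.
alphas <- c(0, 0.25, 0.5, 0.75, 0.95, 1)
cv_err <- sapply(alphas, function(a) {
  cv <- cvSGL(dat, index, type = "logit", alpha = a)
  min(cv$lldiff)                       # best lambda for this alpha
})
best_alpha <- alphas[which.min(cv_err)]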
14.3.3 Comparison of Performance
14.3.3.1 Prediction Performance
Since the entire data set is rather small (n = 92), instead of dividing the set into a training and a test set once, we performed random subsampling: we repeatedly chose 70% of the patient indices at random (without replacement) for training and used the rest for testing. For each trial, we measured the prediction performance on the test set of the predictor trained on the corresponding training set.
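A minimal sketch of this repeated subsampling loop, using glmnet as the predictor and pROC for the AUC; both library choices, the synthetic data, and all names are illustrative assumptions rather than the chapter's actual setup.

library(glmnet)
library(pROC)

set.seed(2)
n <- 92; p <- 100
x <- matrix(rnorm(n * p), n, p)
y <- rbinom(n, 1, plogis(x[, 1] - x[, 2]))   # placeholder binary outcome

aucs <- replicate(20, {
  train <- sample(n, size = floor(0.7 * n))  # 70% of indices, drawn without replacement
  fit <- cv.glmnet(x[train, ], y[train], family = "binomial", alpha = 1)
  prob <- predict(fit, newx = x[-train, ], s = "lambda.min", type = "response")
  as.numeric(auc(roc(y[-train], as.vector(prob), quiet = TRUE)))
})
summary(aucs)                                # distribution of test AUC over the trials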
Figure 14.3 shows the AUC (area under the curve) [11, 20] scores from 20 random subsampling trials. The AUC score (left panel) is improved by grouped selection (GL and SGL) compared to individual selection (LASSO). However, grouped selection resulted in choosing a larger number of features than LASSO (right panel). As expected, fewer features were chosen by SGL than by GL, but at the cost of a small loss in prediction performance.
In fact, the number of selected features is closely related to the cost of clinical tests built upon the chosen features. All numbers were relatively small (≤ 100) in our case; however, some would prefer a smaller number of features to reduce cost if the degradation in prediction were not significant. In this regard, SGL in Fig. 14.3 appeared to provide a good compromise between the number of features and prediction performance.