We define a set $S$ which includes all the indexes of the features whose weights are equal to zero for all the classes in the following way:

$$S = \left\{\, j \;\middle|\; w_{c,j} = 0, \;\; \forall\, c \in \{1, \ldots, m\} \,\right\}.$$

In that sense, all the zero-valued weights can be removed from the FFP computation. We can also compute the effective feature dimensionality reduction as the fraction of selected features $\rho = \frac{d - |S|}{d}$, where $|S|$ indicates the cardinality of the set. Additionally, we can measure the feature reduction which is possible per class. This measure gives us an idea of how fast the FFP computation can be. So we also define the mean feature dimensionality reduction as

$$\bar{\rho} = \frac{1}{m} \sum_{c=1}^{m} \frac{d - |S_c|}{d},$$

where $S_c$ are the sets of indexes of zero-valued features per class.
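Both measures are straightforward to compute once the per-class weights are available. The following is a minimal sketch, assuming the weights $w_{c,j}$ are stored as an $m \times d$ NumPy array (a storage layout chosen here purely for illustration); the tolerance parameter, also our addition, treats numerically tiny weights as zero:

```python
import numpy as np

def feature_reduction_stats(W, tol=0.0):
    """Reduction measures for an (m, d) weight matrix W.

    W   : per-class weights w_{c,j} (hypothetical layout).
    tol : weights with |w| <= tol are treated as zero (our addition).
    """
    zero = np.abs(W) <= tol            # True where w_{c,j} = 0
    m, d = W.shape

    # S: indexes j whose weight is zero for *all* classes
    S = np.flatnonzero(zero.all(axis=0))
    rho = (d - len(S)) / d             # fraction of selected features

    # mean per-class reduction: (1/m) * sum_c (d - |S_c|) / d
    rho_bar = np.mean([(d - zero[c].sum()) / d for c in range(m)])
    return S, rho, rho_bar
```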
6.3 L1-L2 SVM Algorithm
The conventional L2-SVM approach is considered one of the state-of-the-art methods for classification, and several effective techniques have been developed throughout the years for training these models (Fan et al. 2008; Ghio et al. 2012; Keerthi et al. 2001; Platt 1998; Shalev-Shwartz et al. 2007). While it allows deriving sparse classifiers (i.e. models described by exploiting a limited subset of training patterns), L2-SVM (Vapnik 1998) does not perform any feature reduction, which becomes a limitation for the analysis of the dataset and the interpretability of the informative content of the inputs. On the other hand, L1-SVM introduces an automatic dimensionality reduction effect into the learning process. However, despite this being very appealing for this task, L1-SVM is also characterized by some drawbacks:
1. No feature grouping effect characterizes L1 models, i.e. clusters of highly cross-correlated inputs are usually not entirely selected by the training procedure (Segal et al. 2003);
2. When the dimensionality of the dataset is remarkably larger than the number of samples, L1 models are able to exploit only a number of inputs at most equal to the cardinality of the training set, which could be restrictive in some applications (Zou and Hastie 2005);
3. L1-SVMs require custom ad-hoc algorithms to be developed for classifier training (Friedman et al. 2010), which do not exploit the huge effort spent in the last decades on designing effective solvers for the conventional SVM (e.g. Keerthi et al. 2001; Platt 1998).
In order to deal with the first two points above, an SVM which combines the L1- and L2-norms has been proposed by Zou and Hastie (2005). It enhances feature grouping effects in model training, properly balances sparsity and dimensionality reduction, and combines the effectiveness of the L2 approach with the feature selection characteristics of L1-SVMs.
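Although the full formulation is developed in the following pages, the combined penalty can be sketched, in our own notation, as a hinge-loss objective regularized by both norms of the weight vector (the trade-off hyperparameters $\lambda_1$ and $\lambda_2$ are introduced here for illustration):

$$\min_{\mathbf{w},\, b} \;\; \sum_{i=1}^{n} \left[\, 1 - y_i \left( \mathbf{w}^{\top} \mathbf{x}_i + b \right) \right]_{+} \; + \; \lambda_1 \, \| \mathbf{w} \|_1 \; + \; \lambda_2 \, \| \mathbf{w} \|_2^2 .$$

Setting $\lambda_1 = 0$ recovers the conventional L2-SVM, setting $\lambda_2 = 0$ yields the L1-SVM, and intermediate values trade sparsity against the grouping effect.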
Moreover, to cope with the third issue, we present a new training tool that efficiently deals with SVMs based on the L1-, L2- and L1-L2-norms. The proposal builds on the efficient solvers developed in the last decades for L2-SVM (e.g. Keerthi et al. 2001; Platt 1998), and thus can be implemented with minimal effort.
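As a point of reference only (the solver proposed in this chapter is developed below and is not tied to any particular library), the three penalty regimes can be reproduced with off-the-shelf tools. The following sketch uses scikit-learn, where the L1-L2 case is approximated by a stochastic hinge-loss solver:

```python
from sklearn.svm import LinearSVC
from sklearn.linear_model import SGDClassifier

# Conventional L2-SVM: dense weight vector, no feature selection.
l2_svm = LinearSVC(penalty="l2")

# L1-SVM: sparse weights; selects at most n features when d >> n.
l1_svm = LinearSVC(penalty="l1", loss="squared_hinge", dual=False)

# L1-L2 (elastic-net) penalty on a hinge loss; l1_ratio trades
# sparsity (L1) against the grouping effect (L2).
l1l2_svm = SGDClassifier(loss="hinge", penalty="elasticnet",
                         alpha=1e-4, l1_ratio=0.5)

# Each model is then trained with its fit(X, y) method.
```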