techniques previously been proposed in the literature (Shawe-Taylor and Sun 2011). For example, the well-known and widely used SMO (Keerthi et al. 2001; Platt 1998) consists in iteratively updating the two Lagrange multipliers $\alpha_i$, $i \in \{1, \ldots, n\}$, that most violate the KKT conditions until convergence is reached.
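To make the idea concrete, the following is a minimal sketch of a simplified SMO loop for the standard soft-margin SVM dual. It selects the second multiplier at random instead of using the maximal-violating-pair heuristic of Keerthi et al. (2001), and the linear kernel, tolerances, and stopping rule are illustrative assumptions rather than details taken from the text.

```python
import numpy as np

def simplified_smo(X, y, C=1.0, tol=1e-3, max_passes=10, seed=0):
    """Simplified SMO for the soft-margin SVM dual with a linear kernel."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    K = X @ X.T                                   # linear kernel matrix
    alpha, b, passes = np.zeros(n), 0.0, 0
    while passes < max_passes:
        changed = 0
        for i in range(n):
            E_i = (alpha * y) @ K[:, i] + b - y[i]          # prediction error on sample i
            # update only if alpha_i violates the KKT conditions (within tol)
            if (y[i] * E_i < -tol and alpha[i] < C) or (y[i] * E_i > tol and alpha[i] > 0):
                j = rng.choice([k for k in range(n) if k != i])   # random second multiplier
                E_j = (alpha * y) @ K[:, j] + b - y[j]
                ai, aj = alpha[i], alpha[j]
                # box bounds keeping the pair feasible for the dual constraints
                if y[i] != y[j]:
                    L, H = max(0.0, aj - ai), min(C, C + aj - ai)
                else:
                    L, H = max(0.0, ai + aj - C), min(C, ai + aj)
                eta = 2 * K[i, j] - K[i, i] - K[j, j]
                if L == H or eta >= 0:
                    continue
                alpha[j] = np.clip(aj - y[j] * (E_i - E_j) / eta, L, H)
                if abs(alpha[j] - aj) < 1e-5:
                    continue
                alpha[i] = ai + y[i] * y[j] * (aj - alpha[j])
                # bias update so the KKT conditions hold for the changed pair
                b1 = b - E_i - y[i] * (alpha[i] - ai) * K[i, i] - y[j] * (alpha[j] - aj) * K[i, j]
                b2 = b - E_j - y[i] * (alpha[i] - ai) * K[i, j] - y[j] * (alpha[j] - aj) * K[j, j]
                b = b1 if 0 < alpha[i] < C else b2 if 0 < alpha[j] < C else (b1 + b2) / 2
                changed += 1
        passes = passes + 1 if changed == 0 else 0
    w = (alpha * y) @ X                           # primal weights for the linear kernel
    return w, b
```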
The L2-SVM method described above does not perform any dimensionality reduction. This is, however, undesirable in some practical applications where it is necessary to highlight relevant features, discard noisy ones, and reduce the computational burden of classifying new samples. To address this issue, the replacement of the L2-Norm term with the Manhattan distance or L1-Norm ($\|\mathbf{w}\|_1 = \sum_{j=1}^{d} |w_j|$) has been proposed (Tibshirani 1996):
$$
\min_{\mathbf{w},\, b,\, \boldsymbol{\xi}} \;\; \|\mathbf{w}\|_1 + C\, \mathbf{1}_n^\top \boldsymbol{\xi}
\quad \text{s.t.} \quad
Y\left(X\mathbf{w} + b\, \mathbf{1}_n\right) \geq \mathbf{1}_n - \boldsymbol{\xi}, \quad \boldsymbol{\xi} \geq \mathbf{0}_n
\qquad (6.2)
$$
This L1 regularization introduces an automatic feature selection effect which is embedded in the learning process.
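As a rough illustration of this effect, the sketch below compares the number of non-zero weights obtained with an L1 and an L2 penalty. scikit-learn's LinearSVC (which minimises a squared hinge loss rather than the exact formulation of Eq. (6.2)) and the synthetic data set are stand-ins introduced here only for illustration.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.svm import LinearSVC

# Synthetic data: only 5 of the 50 features carry class information.
X, y = make_classification(n_samples=300, n_features=50, n_informative=5,
                           n_redundant=0, random_state=0)

l1 = LinearSVC(penalty="l1", loss="squared_hinge", dual=False, C=0.1).fit(X, y)
l2 = LinearSVC(penalty="l2", loss="squared_hinge", dual=False, C=0.1).fit(X, y)

print("L1 non-zero weights:", np.count_nonzero(l1.coef_))  # typically only a handful
print("L2 non-zero weights:", np.count_nonzero(l2.coef_))  # typically all 50
```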
The optimization problem using the L1-Norm can be reformulated in the following way, by splitting the weight vector as $\mathbf{w} = \mathbf{w}^+ - \mathbf{w}^-$ with $\mathbf{w}^+, \mathbf{w}^- \geq \mathbf{0}_d$:
$$
\min_{\mathbf{w}^+,\, \mathbf{w}^-,\, b,\, \boldsymbol{\xi}} \;\; \mathbf{1}_d^\top \left(\mathbf{w}^+ + \mathbf{w}^-\right) + C\, \mathbf{1}_n^\top \boldsymbol{\xi}
\quad \text{s.t.} \quad
Y\left(X(\mathbf{w}^+ - \mathbf{w}^-) + b\, \mathbf{1}_n\right) \geq \mathbf{1}_n - \boldsymbol{\xi}, \quad \boldsymbol{\xi} \geq \mathbf{0}_n, \quad \mathbf{w}^+, \mathbf{w}^- \geq \mathbf{0}_d
\qquad (6.3)
$$
The solution to this problem provokes many of the weights $w_j$ to become zero, which is useful for our needs, as the calculations required to perform the FFP are directly related to the number of weights different from zero. In other words, the L1-SVM allows dimensionality reduction to be performed thanks to the characteristics of the Manhattan norm: several weights $w_j$ will (generally) be nullified during the learning procedure, in contrast with the conventional L2-SVM, where $w_j \neq 0$ for all $j = 1, \ldots, d$ in the final model. The disadvantage of this approach is that the effective tools developed for the conventional L2-SVM cannot be applied for solving it. Note also that the conventional kernel trick cannot be exploited in the previous formulation, thus the effectiveness of linear models assumes an ever greater importance. The problem in Eq. (6.3) is, however, a standard Linear Programming problem, for which many tools have been developed throughout the years (Press et al. 2007).
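As an example of how Eq. (6.3) maps onto an off-the-shelf LP solver, the following sketch sets it up for scipy.optimize.linprog with the decision vector stacked as $[\mathbf{w}^+, \mathbf{w}^-, b, \boldsymbol{\xi}]$; the toy data, the value of $C$, and the sparsity threshold are assumptions made only for illustration.

```python
import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(0)
n, d, C = 60, 10, 1.0                        # samples, features, soft-margin parameter
X = rng.normal(size=(n, d))
y = np.where(X[:, 0] - X[:, 1] + 0.1 * rng.normal(size=n) > 0, 1.0, -1.0)  # 2 informative features

# Decision vector z = [w+ (d), w- (d), b (1), xi (n)]
c = np.concatenate([np.ones(d), np.ones(d), [0.0], C * np.ones(n)])

# Margin constraints y_i (x_i . (w+ - w-) + b) + xi_i >= 1, rewritten as A_ub z <= b_ub
YX = y[:, None] * X
A_ub = np.hstack([-YX, YX, -y[:, None], -np.eye(n)])
b_ub = -np.ones(n)

bounds = [(0, None)] * (2 * d) + [(None, None)] + [(0, None)] * n  # w+, w-, xi >= 0; b free

res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=bounds, method="highs")
w = res.x[:d] - res.x[d:2 * d]               # recover w = w+ - w-
b = res.x[2 * d]
print("non-zero weights:", np.flatnonzero(np.abs(w) > 1e-6))
```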
The extension of the L1-SVM binary problem to multiple classes for our application can be achieved using an OVA approach (MultiClass L1-SVM, MC-L1-SVM), as we also did in Chap. 5, but with some modifications in order to take advantage of the intrinsic feature selection mechanisms of the model. The final classification is carried out by considering the output of $m$ linear models:
$$
f_c(\mathbf{x}) = \mathbf{w}_c^\top \mathbf{x} + b_c, \qquad c \in \{1, \ldots, m\}
\qquad (6.4)
$$
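A possible sketch of this OVA scheme is given below: each class receives its own L1-regularised linear model and the predicted label is the one with the largest $f_c(\mathbf{x})$. scikit-learn's L1-penalised LinearSVC (squared hinge loss) is used here as a stand-in for the exact L1-SVM of Eq. (6.3), and the data set and number of classes are illustrative assumptions.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.svm import LinearSVC

m = 4                                            # number of classes
X, t = make_classification(n_samples=400, n_features=30, n_informative=6,
                           n_classes=m, n_clusters_per_class=1, random_state=0)

models = []
for c in range(m):
    y_c = np.where(t == c, 1, -1)                # one-vs-all relabelling
    clf = LinearSVC(penalty="l1", loss="squared_hinge", dual=False, C=0.5)
    clf.fit(X, y_c)
    models.append(clf)
    print(f"class {c}: {np.count_nonzero(clf.coef_)} non-zero weights")

# f_c(x) = w_c^T x + b_c for every class; final label = argmax_c f_c(x)
scores = np.column_stack([mdl.decision_function(X) for mdl in models])
pred = scores.argmax(axis=1)
print("training accuracy:", (pred == t).mean())
```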
Each binary model is learned independently, and not all the zero-valued weights for one class are also zero for the others. For such purposes, we can define the set