order to solve our multiclass HAR problem. In a similar way, we present in Sect. 6.3 the combined algorithm (MultiClass L1-L2-SVM (MC-L1-L2-SVM)), which merges the effectiveness of L2 models and the feature selection characteristics of L1 solutions for HAR. Moreover, we describe our proposed training algorithm (Extended SMO (EX-SMO)). Experimental results regarding the proposed SVM approaches, the addition of gyroscopes, and the feature selection mechanisms are presented in Sect. 6.4. Finally, we summarize the chapter in Sect. 6.5.
6.2 L1-Norm and L2-Norm SVMs for Activity Recognition
Our target is to design a model that can run effectively on smartphones with limited battery life and computational restrictions. We must therefore identify the simplest possible classifier, exploiting the smallest set of features, that guarantees the best performance/computational-burden ratio. For these purposes, we pursue the use of linear models, which employ only those selected inputs that are crucial to attain sufficient classification accuracy. In this section we formulate the OVA SVMs with the L1- and L2-Norms.
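To fix ideas, the following is a minimal sketch of how such an OVA linear classifier scores a feature vector at prediction time; the class count, feature dimension, and weights below are invented placeholders, not values from this chapter:

```python
import numpy as np

# Hypothetical one-vs-all (OVA) linear scoring: one weight vector w_c and
# bias b_c per activity class, trained separately (placeholders here).
rng = np.random.default_rng(0)
n_classes, d = 6, 561                 # assumed sizes, for illustration only
W = rng.normal(size=(n_classes, d))   # rows are the per-class weight vectors
b = rng.normal(size=n_classes)

def predict(x):
    """Pick the class whose linear score f_c(x) = w_c^T x + b_c is largest."""
    return int(np.argmax(W @ x + b))

x = rng.normal(size=d)                # one (synthetic) feature vector
print(predict(x))
```

At prediction time the cost is a single matrix-vector product, which is what makes linear OVA models attractive under the smartphone constraints above.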
In the framework of supervised learning and in the case of binary classification problems, the goal is to approximate the relationship between examples from a set $X$ composed of elements $x_i \in \mathbb{R}^d$ and a set $Y$ which contains the output targets $y_i = \pm 1$. This relationship is encapsulated by a fixed, but unknown, probability distribution $P$. A training set $D_n = \{(x_1, y_1), \ldots, (x_n, y_n)\}$ is sampled according to $P$. The learning algorithm maps $D_n$ to $f \in \mathcal{F}$ with a linear separator in the original space, $f(x) = w^T x + b$. Moreover, the accuracy in representing the hidden relationship $P$ is measured with reference to a loss function $\ell(f(x), y)$.
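As a concrete toy illustration of these objects (all data below is invented), a training set $D_n$ and a candidate linear separator can be written as:

```python
import numpy as np

# Toy training set D_n = {(x_i, y_i)}: n examples in R^d with labels +/-1.
rng = np.random.default_rng(1)
n, d = 8, 3
X = rng.normal(size=(n, d))          # rows are the examples x_i
y = np.where(X[:, 0] > 0, 1, -1)     # synthetic targets y_i in {-1, +1}

w = rng.normal(size=d)               # some linear separator f(x) = w^T x + b
b = 0.0
f = X @ w + b                        # decision values f(x_i) for every example
print(np.sign(f))                    # predicted labels
print(y)                             # true targets, for comparison
```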
In general, the hard loss function $\ell_H(f(x), y) = (1 - y\,\operatorname{sign}(f(x)))/2$ seems the most natural choice, as it counts the number of misclassifications, but unfortunately it is non-convex. For this reason the hinge loss function $\xi(f(x), y) = [1 - y f(x)]_+$ is exploited instead (Vapnik 1998). It is possible to introduce a regularization term in order to adjust the size of the class. In this case, we choose the Euclidean norm $\|w\|_2^2 = \sum_{j=1}^{d} w_j^2$, also known as the L2-Norm (Tikhonov and Arsenin 1978).
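Before moving to the optimization problem, the two losses can be compared numerically; the following is a small sketch of our own (the decision values and labels are arbitrary):

```python
import numpy as np

def hard_loss(f, y):
    # l_H(f(x), y) = (1 - y * sign(f(x))) / 2: 1 on a misclassification, else 0
    return (1 - y * np.sign(f)) / 2

def hinge_loss(f, y):
    # xi(f(x), y) = [1 - y f(x)]_+ : a convex upper bound on the hard loss
    return np.maximum(0.0, 1 - y * f)

f = np.array([-2.0, -0.5, 0.3, 1.5])   # decision values f(x_i)
y = np.array([ 1.0,  1.0, 1.0, 1.0])   # true targets
print(hard_loss(f, y))                 # [1.  1.  0.  0. ]
print(hinge_loss(f, y))                # [3.  1.5 0.7 0. ]
```

Note how the hinge loss still penalizes the third example, which is correctly classified but falls inside the margin.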
This yields the primal formulation using the L2-Norm in the minimization problem:
$$\min_{w,\,b,\,\xi} \ \frac{1}{2}\|w\|_2^2 + C\,\mathbf{1}_n^T \xi \quad \text{s.t.} \quad Y(Xw + b\,\mathbf{1}_n) \geq \mathbf{1}_n - \xi, \quad \xi \geq \mathbf{0}_n, \qquad (6.1)$$

where $\xi_i = \xi(f(x_i), y_i)$, $X = [x_1 | \ldots | x_n]^T$, $y = [y_1 | \ldots | y_n]^T$, and $Y = \operatorname{diag}(y)$ ($Y$ is a diagonal matrix where the elements on the diagonal are the $y_i$, $i \in \{1, \ldots, n\}$).
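Problem (6.1) is a convex quadratic program that any off-the-shelf QP solver can handle. As a rough illustration only (this is not the chapter's EX-SMO trainer), the equivalent unconstrained form $\min_{w,b} \frac{1}{2}\|w\|_2^2 + C \sum_i [1 - y_i(w^T x_i + b)]_+$ can be attacked with subgradient descent:

```python
import numpy as np

def train_l2_svm(X, y, C=1.0, lr=0.01, epochs=200):
    """Subgradient descent on 0.5*||w||^2 + C * sum(max(0, 1 - y*(Xw + b)))."""
    n, d = X.shape
    w, b = np.zeros(d), 0.0
    for _ in range(epochs):
        margins = y * (X @ w + b)
        active = margins < 1                    # examples with nonzero hinge loss
        grad_w = w - C * (y[active, None] * X[active]).sum(axis=0)
        grad_b = -C * y[active].sum()
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

# Tiny linearly separable toy problem (invented data).
X = np.array([[2.0, 1.0], [1.5, 2.0], [-1.0, -1.5], [-2.0, -0.5]])
y = np.array([1.0, 1.0, -1.0, -1.0])
w, b = train_l2_svm(X, y)
print(np.sign(X @ w + b))   # should reproduce y on this toy set
```

The learning rate, epoch count, and data are arbitrary choices for the sketch; a production trainer would rely on a dedicated solver such as SMO.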
Also, by introducing $n$ Lagrange multipliers $\alpha$, we can obtain the dual formulation.
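For reference, a textbook form of this dual under the notation above (the chapter's own statement may differ in details) is

$$\max_{\alpha} \ \mathbf{1}_n^T \alpha - \frac{1}{2}\,\alpha^T Y X X^T Y \alpha \quad \text{s.t.} \quad \mathbf{0}_n \leq \alpha \leq C\,\mathbf{1}_n, \quad y^T \alpha = 0,$$

with the primal solution recovered as $w = X^T Y \alpha$.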