Table 6.5 Comparison of the EX-SMO algorithm training time (in hours) against SMLP and SMO

                      L1-SVM    L1-L2-SVM (EX-SMO), λ =                             L2-SVM
                      (SMLP)    0.001   0.005   0.01    0.05   0.1    0.5    1      (SMO)
  Training time (h)   2.54      2.37    1.97    1.54    1.47   1.39   1.32   1.14   1.13
(Flannery et al. 1992) and L2 SVM (the conventional SMO (Keerthi et al. 2001)).
Table 6.5 shows that EX-SMO performs comparably to SMO on L2 problems and
outstrips SMLP on training L1 SVMs, although EX-SMO's effectiveness tends to decrease
as λ → 0: this is expected, since we are using a QP tool to solve an (almost) LP problem,
and this suboptimal approach leads to a slight loss in the algorithm's performance.
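For reference, the generic textbook primal formulations of the two extreme cases make the QP-versus-LP remark explicit; these are standard forms and not necessarily the exact problem solved by EX-SMO in this chapter.

```latex
% Generic primal problems (standard textbook forms, shown only to illustrate
% the remark above): the L1-SVM is a linear program, the L2-SVM a quadratic program.
\begin{align}
  \text{L1-SVM (LP):} \quad & \min_{w,b,\xi}\ \|w\|_1 + C \sum_{i} \xi_i
    \quad \text{s.t.}\ y_i (w^\top x_i + b) \ge 1 - \xi_i,\ \xi_i \ge 0, \\
  \text{L2-SVM (QP):} \quad & \min_{w,b,\xi}\ \tfrac{1}{2}\|w\|_2^2 + C \sum_{i} \xi_i
    \quad \text{s.t.}\ y_i (w^\top x_i + b) \ge 1 - \xi_i,\ \xi_i \ge 0.
\end{align}
```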
Table 6.6 reports the confusion matrices for the L1, L1-L2 and L2 SVMs obtained on
the D2T dataset. We do not present results for the different solvers, as they show no
differences. Contrary to the expected behavior, we observe a consistent classification
performance across all the methods in terms of accuracy (with variations below 1%).
However, we found an interesting result: the highest accuracy (96.91%) is achieved
when λ = 0.05, rather than with MC-L2-SVM or MC-L1-SVM. This finding is possibly
linked to the fact that this intermediate solution selects relevant features and filters
noisy ones, two aspects that cannot be properly handled in the extreme cases
λ = 0 and λ = 1, respectively.
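The effect of such an intermediate choice can be reproduced, in spirit, with off-the-shelf tools. The sketch below is not the chapter's EX-SMO algorithm: it uses scikit-learn's SGDClassifier with an elastic-net penalty, whose l1_ratio plays a role loosely analogous to λ here, on synthetic stand-in data, only to show how sweeping the L1/L2 mix trades accuracy against sparsity.

```python
# Minimal sketch (not EX-SMO): sweep the mix between L1 and L2 regularization
# on a hinge-loss linear classifier and watch accuracy vs. number of selected
# features. The data set is a synthetic placeholder, not the HAR data.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import SGDClassifier
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=2000, n_features=561, n_informative=50,
                           random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# l1_ratio = 1.0 -> pure L1 (sparse solution), l1_ratio = 0.0 -> pure L2 (dense).
for l1_ratio in (1.0, 0.95, 0.5, 0.05, 0.0):
    clf = SGDClassifier(loss="hinge", penalty="elasticnet",
                        alpha=1e-4, l1_ratio=l1_ratio, random_state=0)
    clf.fit(X_train, y_train)
    acc = clf.score(X_test, y_test)
    n_selected = np.count_nonzero(clf.coef_)
    print(f"l1_ratio={l1_ratio:4.2f}  accuracy={acc:.3f}  "
          f"non-zero weights={n_selected}/{clf.coef_.size}")
```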
Moreover, in Table 6.7 we collect the experimental results for all the values of
λ ≥ 0 and when λ → ∞. They include classification accuracy, dimensionality
reduction (overall ρ and average ρ̄), and grouping ability σ. First, we can observe
that ρ decreases (or increases) with λ. This corroborates the dimensionality
reduction capability of L1-SVM and, equivalently, of L1-L2-SVM with small values of
λ. However, although the dimensionality reduction capability is maximized for L1-SVM,
feature grouping effects, namely the ability of the algorithm to select (or neglect)
clusters of highly cross-correlated inputs, are usually absent when λ → 0, although
they are desirable in order to gain more insight into the informative content of each
input (Segal et al. 2003). In order
to evaluate whether L1-L2-SVM is able to overcome these L1-related issues, as
expected from the literature, we computed the correlation matrix M_C ∈ R^(d×d) of X
and we created feature clusters by joining the 10 most cross-correlated inputs. Our
purpose was to verify the percentage σ of cluster features selected (or neglected)
by the different procedures (ranging from L1-SVM to L2-SVM): a high value of σ is
obviously desirable. The results are also shown in the table, and it is thus worth
noting that a very small subset of features (L1-SVM) is necessary to guarantee
an acceptable classification performance, though grouping effects are limited. By
balancing the effects of the L1 and L2 regularization terms, we can decide whether we
want a higher accuracy, a smaller number of features or a higher grouping ability.
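The sketch below is one possible reading of this grouping-ability check, not the authors' code: it builds the correlation matrix of X, grows clusters from the 10 most cross-correlated inputs, and reports the fraction of clusters whose members are jointly selected or jointly neglected by a trained weight vector w (for example, clf.coef_ from the previous sketch). The function name and the cluster-growing heuristic are illustrative assumptions.

```python
# Rough sketch of the grouping-ability metric described above (our interpretation).
import numpy as np

def grouping_ability(X, w, cluster_size=10, n_clusters=20):
    """Fraction of correlated-feature clusters whose members are either all
    selected (non-zero weight) or all neglected (zero weight) by w."""
    M_C = np.abs(np.corrcoef(X, rowvar=False))   # feature correlation matrix, d x d
    np.fill_diagonal(M_C, 0.0)
    selected = np.abs(np.ravel(w)) > 1e-8        # which features carry non-zero weight

    consistent = 0
    for _ in range(n_clusters):
        # Seed a cluster at the most correlated remaining pair, then grow it with
        # the features most correlated to the seed feature.
        seed = np.unravel_index(np.argmax(M_C), M_C.shape)[0]
        neighbours = np.argsort(M_C[seed])[::-1][:cluster_size - 1]
        cluster = np.append(neighbours, seed)
        if selected[cluster].all() or (~selected[cluster]).all():
            consistent += 1
        M_C[cluster, :] = 0.0                     # do not reuse these features
        M_C[:, cluster] = 0.0
    return consistent / n_clusters

# Hypothetical usage with the classifier from the previous sketch:
#   sigma = grouping_ability(X_train, clf.coef_)
#   print(f"grouping ability sigma = {sigma:.2f}")
```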
In the particular case of HAR using smartphones, as we are targeting the minimization
of the computational burden to maximize battery duration, and we are only
partially interested in gaining insight into the information content of each input, L1-L2