One of the main drawbacks of L1-SVMs is that they cannot resort to non-linearity through the kernel trick to improve classification accuracy (Vapnik 1998). To target this issue, here we compare conventional non-linear models against linear classifiers in the particular case of HAR, showing that the latter achieve performance/complexity ratios similar to those of the former (Sect. 6.4.1). The HAR dataset ($D_2$), used in the forthcoming analysis, is composed of a large set of time and frequency domain features extracted from the accelerometer and gyroscope signals, together with subsets of these, such as $D_2^T$, which only includes time domain features.
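As a purely illustrative example of how such a feature subset can be isolated, the short sketch below selects the time domain columns of a feature matrix; the "t"/"f" name prefixes used here to distinguish time and frequency domain features are an assumption made for this sketch, not a detail taken from the dataset description.

import numpy as np

def time_domain_subset(X, feature_names):
    # Assumed naming convention: time domain features start with "t",
    # frequency domain features start with "f".
    idx = [i for i, name in enumerate(feature_names) if name.startswith("t")]
    return X[:, idx]

# Example usage with placeholder data:
# X_time = time_domain_subset(X, feature_names)  # builds D_2^T from D_2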
6.4.1 Linear Versus Non-Linear SVMs
The first experiment aimed to compare the performance of SVM models based on linear and non-linear kernels. For that purpose, we trained two models on $D_2$: first, the standard OVA MC-GK-SVM with Gaussian kernel $K(x_i, x_j) = \exp(-\gamma \| x_i - x_j \|^2)$. For the model selection of this SVM, we used a KCV with $k = 10$ and searched for the two SVM hyperparameters $C$ and $\gamma$, with $C$ in the range $[10^{-4}, 10^{2}]$ and $\gamma$ in $[10^{-4}, 10^{2}]$, both partitioned into 20 points equally spaced on a logarithmic scale. The second method was the MultiClass L2-SVM (MC-L2-SVM), which was also trained using $k = 10$ and the same partition for its only hyperparameter $C$.
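As a rough illustration of this model selection protocol, the sketch below sets up an equivalent grid search with scikit-learn. The library, the placeholder variables X and y, and the exact grid boundaries are assumptions for illustration only, not the implementation used in this thesis; LinearSVC is used as a stand-in for the MC-L2-SVM because its default squared hinge loss is close in spirit.

import numpy as np
from sklearn.model_selection import GridSearchCV
from sklearn.multiclass import OneVsRestClassifier
from sklearn.svm import SVC, LinearSVC

# 20 logarithmically spaced values per hyperparameter, as in the text
C_range = np.logspace(-4, 2, 20)
gamma_range = np.logspace(-4, 2, 20)

# Non-linear model: one-vs-all SVM with Gaussian (RBF) kernel,
# grid search over both C and gamma with 10-fold cross-validation
rbf_search = GridSearchCV(
    OneVsRestClassifier(SVC(kernel="rbf")),
    param_grid={"estimator__C": C_range, "estimator__gamma": gamma_range},
    cv=10,
)

# Linear model: only C has to be tuned, so the grid is one-dimensional
linear_search = GridSearchCV(
    LinearSVC(),
    param_grid={"C": C_range},
    cv=10,
)

# rbf_search.fit(X, y); linear_search.fit(X, y)  # X, y: assumed D_2 data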
The confusion matrices in Table 6.1 depict the classification results obtained for the 6 BAs using the complete set of features from $D_2$. The accuracies achieved with the two methods are very similar, thus showing the equivalence between these two models. The linear approach performs slightly better, differing by only 0.04% with respect to the non-linear one. The results also show sensitivity and specificity measures for all the activities.
Some large datasets have shown similar classification performance when linear or non-linear approaches are used (Schölkopf and Smola 2001), meaning that mapping the data into a higher-dimensional space is not always required. The advantage of a linear kernel is the faster prediction it allows. Moreover, the model selection of the MC-GK-SVM required a grid search over two hyperparameters, which is computationally more expensive than for the MC-L2-SVM, which only has one. Our application only requires online prediction, while learning is performed offline.
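To make the prediction cost argument concrete, the sketch below contrasts the two per-sample decision functions: the linear model needs a single dot product per class, while the Gaussian kernel model needs one kernel evaluation per support vector. The dimensions, number of support vectors and random coefficients are placeholders, not values from the experiments.

import numpy as np

d, n_sv = 561, 2000              # assumed feature dimension and number of SVs
x = np.random.rand(d)            # one incoming sample to classify
b = 0.1                          # bias term (placeholder)

# Linear SVM: f(x) = w . x + b, O(d) operations per class
w = np.random.rand(d)
f_linear = w @ x + b

# Gaussian kernel SVM: f(x) = sum_i alpha_i y_i K(x_i, x) + b,
# O(n_sv * d) operations per class
sv = np.random.rand(n_sv, d)     # support vectors
alpha_y = np.random.rand(n_sv)   # alpha_i * y_i coefficients
gamma = 0.01
k = np.exp(-gamma * np.sum((sv - x) ** 2, axis=1))
f_rbf = alpha_y @ k + b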
The idea behind this experiment was to study the possibility of employing a linear classifier on the HAR dataset, instead of a more complex approach, without sacrificing recognition performance. This is also justifiable from the SLT perspective, in which the simplest solution that properly classifies the data is always preferred (Vapnik 1995). Taking these findings into account, the linear approach is consequently favored for the prediction of activities, more specifically for its application in resource-limited devices: in fact, the prediction phase is much faster than with the kernelized approach, and linear models make it possible to exploit more sophisticated dimensionality reduction approaches, as will be shown in Sect. 6.4.3. From now on in this thesis, we will only make use of linear SVM models for HAR.
 