Table 4.7 Performance comparison of the ML classifiers

Classifier               Acronym   Accuracy (%)
Decision tree C4.5       DT        83.06
Random forest            RF        89.34
k-Nearest neighbors      k-NN      88.19
Naive Bayes              NB        78.89
Logistic regression      LR        96.40
Multilayer perceptron    MLP       94.60
Support vector machine   SVM       96.50
The Waikato Environment for Knowledge Analysis (WEKA) ML software suite (Hall et al. 2009) was used for this validation task. For the experiments, we worked with the D2 data partition, as it contains all the BAs and the complete set of available features from both inertial sensors. For each of the ML algorithms used, we estimated its performance in terms of classification accuracy on the test data. The results are depicted in Table 4.7.
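The accuracy figures in Table 4.7 are simply the percentage of correctly classified test windows. A minimal plain-Python sketch of this metric, using hypothetical activity labels for illustration, is:

```python
def accuracy(y_true, y_pred):
    """Percentage of test windows whose predicted activity matches the label."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return 100.0 * correct / len(y_true)

# Hypothetical ground-truth and predicted activity labels for five windows.
y_true = ["WALKING", "SITTING", "STANDING", "WALKING", "LAYING"]
y_pred = ["WALKING", "SITTING", "WALKING", "WALKING", "LAYING"]
print(accuracy(y_true, y_pred))  # -> 80.0
```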
Classification results show clear differences between the performances of the selected ML algorithms. Some algorithms have a relatively low performance, with accuracies below 90%: DT, RF, k-NN, and NB. On the other hand, we observed a better outcome in the remaining three ML algorithms: MLP, LR, and SVM. The last two have comparable classification accuracy, with SVM (96.50%) slightly outperforming LR by only 0.1%. These results reinforce SVMs as a good candidate for performing HAR with smartphone inertial data. This is the algorithm selected in our research, and from now on we focus our attention on its study and implementation.
In the particular case of SVMs, we used a multiclass SVM model through an OVA approach, which takes one binary SVM with Gaussian kernel (GK-SVM) per activity. From now on, we will refer to this SVM configuration as MultiClass GK-SVM (MC-GK-SVM). The Gaussian kernel, also known as the Radial Basis Function (RBF) kernel, is commonly used in SVMs because it has been shown to deal successfully with non-linear data and is considered a universal approximator (Wang et al. 2004). We employed the well-known LIBSVM library (Chang and Lin 2011), which can be run under WEKA or Matlab.
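To make the OVA scheme concrete, the sketch below (plain Python, with a hypothetical gamma value and invented per-class decision scores) shows the two building blocks: the Gaussian (RBF) kernel, and the OVA decision rule that picks the activity whose binary SVM produces the largest decision value.

```python
import math

def gaussian_kernel(x, z, gamma=0.25):
    """RBF kernel: K(x, z) = exp(-gamma * ||x - z||^2). gamma is illustrative."""
    sq_dist = sum((xi - zi) ** 2 for xi, zi in zip(x, z))
    return math.exp(-gamma * sq_dist)

def ova_predict(decision_scores):
    """One-vs-All: each binary SVM scores its own class against the rest;
    the predicted activity is the class with the largest decision value."""
    return max(decision_scores, key=decision_scores.get)

# Hypothetical decision values from six per-activity binary GK-SVMs.
scores = {"WALKING": 1.3, "SITTING": -0.4, "STANDING": 0.2,
          "LAYING": -1.1, "UPSTAIRS": 0.7, "DOWNSTAIRS": -0.2}
print(ova_predict(scores))                                 # -> WALKING
print(gaussian_kernel([0.0, 1.0], [0.0, 1.0]))             # identical inputs -> 1.0
```

Note that identical inputs always yield K(x, x) = 1, and the kernel decays toward 0 as the feature vectors move apart, which is what makes the RBF kernel a local similarity measure.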
The classification results for the D2 dataset are presented in Table 4.8 as a confusion matrix. It includes the classification accuracy of the algorithm along with the sensitivity and specificity measures for each class. The results show an overall accuracy of 96.50% for the test data, composed of 2,947 patterns. In the same way, Table 4.8 shows the performance on D3 using the same classification algorithm (MC-GK-SVM), which achieves an accuracy of 95.61%. Notice that the difference between the two datasets lies in the addition of the PT class, which combines all the available transitions into one. Moreover, the number of window samples in this class is smaller than in the other classes. However, we take this into account during training in order to balance the data through the C hyperparameter of the binary SVMs. This dataset is covered in more detail in Chap. 7, where we clarify how this additional class is needed for improving the online HAR system.
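One standard way to realize this balancing, sketched below in plain Python with illustrative window counts (the PT count in particular is hypothetical), is to scale each class's C penalty inversely to its frequency, so that the minority PT class contributes a comparable total penalty during training. LIBSVM supports this directly through its per-class weight option, and the same heuristic appears in scikit-learn as class_weight='balanced'.

```python
# Illustrative per-class window counts; PT (postural transitions) is the
# minority class, as noted in the text. The PT figure is hypothetical.
counts = {"WALKING": 496, "UPSTAIRS": 471, "DOWNSTAIRS": 420,
          "SITTING": 491, "STANDING": 532, "LAYING": 537, "PT": 80}

def balanced_c(counts, base_c=1.0):
    """Scale the SVM's C per class, inversely proportional to class
    frequency: C_k = base_c * n_total / (n_classes * n_k)."""
    n_total = sum(counts.values())
    n_classes = len(counts)
    return {cls: base_c * n_total / (n_classes * cnt)
            for cls, cnt in counts.items()}

weights = balanced_c(counts)
# The minority PT class receives the largest misclassification penalty.
print(max(weights, key=weights.get))  # -> PT
```

With this scaling, misclassifying a PT window costs the optimizer several times more than misclassifying a window of a well-represented class, which counteracts the class imbalance without resampling the data.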
 