order to solve our multiclass HAR problem. In a similar way, we present in Sect. 6.3 the combined algorithm (MultiClass L1-L2-SVM (MC-L1-L2-SVM)), which allows us to merge the effectiveness of L2 models and the feature selection characteristics of L1 solutions for HAR. Moreover, we describe our proposed training algorithm (Extended SMO (EX-SMO)). Experimental results regarding the proposed SVM approaches, the addition of gyroscopes, and the feature selection mechanisms are presented in Sect. 6.4. Finally, we summarize the chapter in Sect. 6.5.
6.2 L1-Norm and L2-Norm SVMs for Activity Recognition
Our target is to design a model which can be effectively run on smartphones with limited battery life and computational restrictions. We thus have to identify the simplest possible classifier exploiting the smallest set of features that guarantees the best performance/computational burden ratio. For these purposes, we pursue the exploitation of linear models, which use only those selected inputs that are crucial to attain sufficient classification accuracy. In this section we formulate the OVA SVMs with the L1- and L2-Norms.
In the framework of supervised learning and in the case of binary classification problems, the goal is to approximate the relationship between examples from a set $\mathcal{X}$, composed of elements $x_i \in \mathbb{R}^d$, and a set $\mathcal{Y}$, which contains the output targets $y_i = \pm 1$. This relationship is encapsulated by a fixed, but unknown, probability distribution $P$. A training set $D_n = \{(x_1, y_1), \ldots, (x_n, y_n)\}$ is sampled according to $P$. The learning algorithm maps $D_n$ to $f \in \mathcal{F}$, with a linear separator in the original space $f(x) = w^T x + b$. Moreover, the accuracy in representing the hidden relationship $P$ is measured with reference to a loss function $\ell(f(x), y)$.
In general, the hard loss function $\ell_H(f(x), y) = (1 - y\,\mathrm{sign}(f(x)))/2$ seems the most natural choice, as it counts the number of misclassifications, but unfortunately it is non-convex. For this reason, the hinge loss function $\xi(f(x), y) = [1 - y f(x)]_+$ is exploited instead (Vapnik 1998). It is possible to introduce a regularization term in order to adjust the size of the class. In this case, we choose the Euclidean norm $\|w\|_2 = \sqrt{\sum_{j=1}^{d} w_j^2}$, also known as the L2-Norm (Tikhonov and Arsenin 1978).
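To make the contrast between the two losses concrete, the following minimal NumPy sketch (the function names are ours, introduced only for illustration) evaluates both for a linear separator $f(x) = w^T x + b$:

```python
import numpy as np

def decision(X, w, b):
    # Linear separator f(x) = w^T x + b, evaluated row-wise on X
    return X @ w + b

def hard_loss(f, y):
    # Hard (0/1) loss (1 - y * sign(f(x))) / 2: counts misclassifications, non-convex in w
    return (1 - y * np.sign(f)) / 2

def hinge_loss(f, y):
    # Hinge loss [1 - y f(x)]_+: a convex upper bound on the hard loss
    return np.maximum(0.0, 1 - y * f)
```

A point that is classified correctly but falls inside the margin ($0 < y f(x) < 1$) has zero hard loss but positive hinge loss; these surplus amounts are exactly what the slack variables $\xi_i$ of the primal problem below collect.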
According to the SRM principle (Vapnik 1998), we can derive, similarly to Eq. (2.2), the primal formulation using the L2-Norm in the minimization problem:
\[
\min_{w, b, \xi}\ \frac{1}{2}\,\|w\|_2^2 + C\,\mathbf{1}_n^T \xi,
\qquad \text{s.t.}\quad Y\,(X w + b\,\mathbf{1}_n) \geq \mathbf{1}_n - \xi,\quad \xi \geq \mathbf{0}_n,
\tag{6.1}
\]
where $\xi_i = \xi(f(x_i), y_i)$, $X = [x_1 | \ldots | x_n]^T$, $y = [y_1 | \ldots | y_n]^T$, and $Y = \mathrm{diag}(y)$ ($Y$ is a diagonal matrix whose diagonal elements are the $y_i$, $i \in \{1, \ldots, n\}$).
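As a purely illustrative sketch (it is not the training procedure proposed in this chapter, and it assumes the external cvxpy library), problem (6.1) can be handed almost verbatim to a generic convex solver, given a data matrix X of shape (n, d) and labels y with entries in {-1, +1}:

```python
import cvxpy as cp
import numpy as np

def fit_l2_svm_primal(X, y, C=1.0):
    # Solve the soft-margin L2-SVM primal of Eq. (6.1) with a generic QP solver.
    n, d = X.shape
    w = cp.Variable(d)
    b = cp.Variable()
    xi = cp.Variable(n)  # slack variables; at the optimum, xi_i equals the hinge loss on example i

    objective = cp.Minimize(0.5 * cp.sum_squares(w) + C * cp.sum(xi))
    constraints = [cp.multiply(y, X @ w + b) >= 1 - xi,  # Y (X w + b 1_n) >= 1_n - xi
                   xi >= 0]
    cp.Problem(objective, constraints).solve()
    return w.value, b.value
```

On larger training sets this generic formulation is mainly useful as a sanity check; the training algorithms used in this chapter work instead on the dual formulation discussed next.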
Also, by introducing $n$ Lagrange multipliers $\alpha$, we can obtain the dual formulation (Eq. (2.13)), which is a CCQP problem and can be solved through many efficient solvers.
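For completeness, in the linear case the dual takes the standard soft-margin form (to be compared with Eq. (2.13)):
\[
\max_{\alpha}\ \mathbf{1}_n^T \alpha - \frac{1}{2}\,\alpha^T Y X X^T Y \alpha,
\qquad \text{s.t.}\quad \mathbf{0}_n \leq \alpha \leq C\,\mathbf{1}_n,\quad y^T \alpha = 0,
\]
a quadratic program in the $n$ variables $\alpha_i$ whose box constraints and single equality constraint are precisely what SMO-type solvers exploit.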
 