subject to the following constraints:

$$\mathbf{y}^T \mathbf{a} = 0, \qquad \mathbf{a} \geq 0$$

where H denotes the Hessian matrix, given by $H_{ij} = y_i y_j (\mathbf{x}_i^T \mathbf{x}_j)$, and f is the unit vector $\mathbf{f} = [1, 1, \ldots, 1]^T$. Having the solutions $a_{0i}$ of the dual optimization problem is sufficient to determine the weight vector and the bias, using the following equations:

$$\mathbf{w} = \sum_{i=1}^{l} a_{0i} y_i \mathbf{x}_i$$

$$b = \frac{1}{N} \sum_{i=1}^{N} \left( \frac{1}{y_i} - \mathbf{w}^T \mathbf{x}_i \right)$$

where N represents the number of support vectors.
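As a concrete illustration of these two equations, here is a minimal numpy sketch on a hand-worked toy problem. It is not the chapter's implementation: the dual solution a0 is supplied directly instead of coming from a quadratic-programming solver, and all variable names are our own.

```python
import numpy as np

# Toy linearly separable set: rows of X are the x_i, labels y_i in {-1, +1}.
X = np.array([[2.0, 2.0], [3.0, 3.0], [-2.0, -2.0], [-3.0, -3.0]])
y = np.array([1.0, 1.0, -1.0, -1.0])

# Optimal dual solution a_0i for this set, worked out by hand;
# only the two points closest to the boundary are support vectors.
a0 = np.array([0.0625, 0.0, 0.0625, 0.0])

# w = sum_{i=1}^{l} a_0i * y_i * x_i ; non-support vectors (a_0i = 0)
# contribute nothing to the sum.
w = (a0 * y) @ X

# b = (1/N) * sum over the N support vectors of (1/y_i - w^T x_i).
sv = a0 > 0
b = np.mean(1.0 / y[sv] - X[sv] @ w)

print(w, b)   # [0.25 0.25] 0.0  -> decision function f(x) = w.x + b
```

Both support vectors end up exactly on the margin, $y_i(\mathbf{w}^T \mathbf{x}_i + b) = 1$, which is consistent with the KKT conditions for the dual solution used.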
The linear classifier presented above has limited capabilities, since it can only be applied to linearly separable data, while in most practical applications the data is not linearly separable.

Nonlinear data has to be mapped to a new feature space of higher dimension using a suitable mapping function Φ, whose range is of very high, potentially infinite, dimension. Fortunately, in all the equations this function appears only in the form of a dot product. From the theory of reproducing kernel Hilbert spaces (Aronszajn, 1950), which is beyond the scope of this chapter, a kernel function is defined as:

$$K(\mathbf{x}_i, \mathbf{x}_j) = \Phi(\mathbf{x}_i)^T \Phi(\mathbf{x}_j)$$

By replacing every dot product in the previous equations with the kernel function, the non-linear hyperplane is determined. This remarkable characteristic of the kernel transformation gives support vector machines the ability to operate on multi-dimensional data without affecting the processing time. Indeed, in the linear case the processing time is roughly the time needed to invert the Hessian matrix, which is of O(n³), where n is the number of training points. Since the transformation from the linear to the non-linear case amounts to this simple kernel substitution, the dimension of the Hessian matrix is not changed and hence the processing time is the same; this explains the applicability and high performance of SVMs on multi-dimensional data.
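To make the processing-time argument concrete, the following sketch (ours, not the chapter's) builds the Hessian once from plain dot products and once from an RBF kernel, a common choice assumed here purely for illustration. Although the RBF feature space is infinite-dimensional, H remains n × n in both cases, which is why the kernel substitution leaves the cost of solving the dual unchanged.

```python
import numpy as np

def hessian(X, y, kernel):
    """H_ij = y_i * y_j * K(x_i, x_j), as defined above."""
    K = kernel(X)
    return np.outer(y, y) * K

def linear_kernel(X):
    return X @ X.T                      # plain dot products

def rbf_kernel(X, gamma=0.5):
    # K(x_i, x_j) = exp(-gamma * ||x_i - x_j||^2); the implicit feature
    # space is infinite-dimensional, yet K is still just n x n.
    sq = np.sum(X**2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2 * (X @ X.T)
    return np.exp(-gamma * d2)

X = np.random.default_rng(0).normal(size=(6, 4))   # n=6 points, 4 features
y = np.array([1, 1, 1, -1, -1, -1], dtype=float)

print(hessian(X, y, linear_kernel).shape)   # (6, 6)
print(hessian(X, y, rbf_kernel).shape)      # (6, 6) -- same size, same cost
```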
Previous work by our team has shown that subject-independent agitation detection is possible by monitoring three vital signs, HR, GSR, and RTD: using an SVM, the agitation state of a subject could be detected even if the device was not trained on that specific subject. This was achieved with an accuracy that reached 84% (Sakr et al., 2008). The limitation of the method was the issue of the "gray zone", the transitional phase during which the subject passes from one state to another. More than 95% of the method's detection errors fall within this area. Further work by our team showed that using SVM with a hierarchical architecture that takes into consideration the distance between the two classes, and hence gives the SVM a "zooming" ability, achieved a high accuracy of 95.1%, higher than what had previously been reported in the literature. In this case as well, a high percentage of the errors fell in the "gray zone" (Sakr et al., 2009). In order to further improve accuracy, we introduced a new confidence measure on the decision of a support vector machine. This confidence measure is used to turn a single 2-class SVM classifier into a multi-level 2-class SVM classifier architecture. Used for agitation detection, the new architecture yields, on the same training set, higher accuracy than a single 2-class SVM classifier.
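The confidence measure itself is developed in the next section; purely as an illustration of the cascade idea described above, the sketch below chains two 2-class SVMs, using the absolute decision value as a stand-in confidence score. The threshold, the retraining rule, and every name here are hypothetical, not the chapter's method.

```python
import numpy as np
from sklearn.svm import SVC

# Hypothetical confidence cut-off delimiting the "gray zone".
CONF_THRESHOLD = 0.8

def train_cascade(X, y):
    # Level 1: an ordinary 2-class SVM over the whole training set.
    level1 = SVC(kernel="rbf").fit(X, y)
    # Stand-in confidence: distance to the separating hyperplane.
    conf = np.abs(level1.decision_function(X))
    gray = conf < CONF_THRESHOLD
    # Level 2 "zooms" in by retraining only on low-confidence samples
    # (requires both classes to appear in the gray zone).
    level2 = None
    if gray.any() and len(np.unique(y[gray])) == 2:
        level2 = SVC(kernel="rbf").fit(X[gray], y[gray])
    return level1, level2

def predict_cascade(model, X):
    level1, level2 = model
    pred = level1.predict(X)
    gray = np.abs(level1.decision_function(X)) < CONF_THRESHOLD
    if level2 is not None and gray.any():
        pred[gray] = level2.predict(X[gray])  # defer gray-zone decisions
    return pred
```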
Multi-Level SVM
The proposed confidence measure is based on a dimension proposed by Vapnik and Chervonenkis, which was named after them: the VC dimension.