with the side conditions

$$0 \le a_l \le C, \quad l = 1, \ldots, L, \qquad (7.16)$$

$$\sum_{l=1}^{L} a_l y_l = 0. \qquad (7.17)$$

The hyperplane is then defined by

$$w = \sum_{l=1}^{L} a_l y_l x_l, \qquad (7.18)$$

$$b = y_l \left(1 - \xi_l\right) - x_l^T w. \qquad (7.19)$$
In (7.19), l denotes the index of the vector x_l with the largest coefficient a_l. The normal vector w is thus represented as a weighted sum of training instances with the coefficients a_l ≤ C, l = 1, ..., L, where C is another free parameter to be determined. By the introduction of the weighting coefficients, the slack variables ξ_l disappear from the optimisation problem. The support vectors are then the training instances x_l that satisfy a_l > 0. By this, L^2 terms of the form x_k^T x_l result, which can be summarised as a matrix. One of the frequently used and highly efficient methods for the recursive computation of this matrix, and thereby for solving the dual problem, is Sequential Minimal Optimisation (SMO), which is introduced in detail in [9]. The classification by the SVM is now given by the function d_{w,b} : X → {−1, +1},

$$d_{w,b}(x) = \operatorname{sgn}\left(w^T x + b\right), \qquad (7.20)$$

where

$$\operatorname{sgn}(u) = \begin{cases} 1, & u \ge 0, \\ -1, & u < 0. \end{cases} \qquad (7.21)$$
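To make these steps concrete, the following minimal Python sketch assumes that the dual coefficients a_l are already available, e.g. from an SMO-style solver as described in [9]; the training data and coefficient values are purely illustrative, not taken from the text. It forms the matrix of inner products x_k^T x_l, recovers w and b according to (7.18) and (7.19) (using a margin support vector with 0 < a_l < C, for which ξ_l = 0), and evaluates the decision function (7.20).

```python
import numpy as np

# Illustrative two-class training set in X = R^2 (hypothetical values).
X_train = np.array([[1.0, 2.0], [2.0, 3.0], [2.5, 0.5], [3.0, 1.0]])
y_train = np.array([1.0, 1.0, -1.0, -1.0])
C = 1.0

# Dual coefficients a_l, assumed to be delivered by an SMO-style solver [9];
# they respect the side conditions 0 <= a_l <= C and sum_l a_l y_l = 0.
a = np.array([0.4, 0.0, 0.4, 0.0])

# The L^2 inner products x_k^T x_l, summarised as a matrix (Gram matrix).
K = X_train @ X_train.T

# Normal vector w = sum_l a_l y_l x_l, Eq. (7.18).
w = (a * y_train) @ X_train

# Bias from the margin support vector (0 < a_l < C) with the largest a_l;
# there xi_l = 0, so Eq. (7.19) reduces to b = y_l - x_l^T w.
margin = (a > 0) & (a < C)
l = int(np.argmax(np.where(margin, a, 0.0)))
b = y_train[l] - X_train[l] @ w

def d(x):
    """Decision function d_{w,b}(x) = sgn(w^T x + b), Eqs. (7.20)-(7.21)."""
    return 1 if w @ x + b >= 0 else -1

print([d(x) for x in X_train])   # reproduces the labels [1, 1, -1, -1]
```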
So far, we are only able to solve pattern recognition problems in which the instances belonging to the (two) different classes can be assigned, with a certain acceptable error, by a hyperplane in the space X. This is referred to as a linear separation problem. Aiming at classes that can only be separated non-linearly, one applies the so-called 'kernel trick' [10]. Figure 7.4 depicts an exemplary two-class problem in one-dimensional space, which can only be solved linearly by a mapping into a higher (two-)dimensional space; in the given example, this mapping even separates the classes without error.
In general, such a transformation is given by the mapping

$$\Phi : X \rightarrow X', \qquad \dim(X') > \dim(X). \qquad (7.22)$$
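As an illustration of such a mapping, the following sketch uses a quadratic feature map Φ(x) = (x, x²). This is a hypothetical choice in the spirit of the one-dimensional example of Figure 7.4, not necessarily the mapping shown there; after lifting the data into two dimensions, the two classes become separable by a straight line.

```python
import numpy as np

# Hypothetical 1D two-class problem: class -1 clusters around the origin,
# class +1 lies on both sides, so no single threshold in X = R separates them.
x = np.array([-3.0, -2.5, -1.0, -0.5, 0.0, 0.5, 1.0, 2.5, 3.0])
y = np.array([1, 1, -1, -1, -1, -1, -1, 1, 1])

def phi(x):
    """Mapping Phi: X -> X' with dim(X') = 2 > dim(X) = 1, cf. Eq. (7.22)."""
    return np.stack([x, x ** 2], axis=-1)

Z = phi(x)

# In the lifted space the classes are separated by the hyperplane (line)
# with normal vector w = (0, 1) and bias b = -2, i.e. by z_2 = 2.
w, b = np.array([0.0, 1.0]), -2.0
pred = np.where(Z @ w + b >= 0, 1, -1)
print(bool(np.all(pred == y)))   # True: error-free separation after the mapping
```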
 