Fig. 7.4 Solving an exemplary two-class problem by mapping into a higher-dimensional space: while the problem cannot be solved linearly in the one-dimensional (original) space, mapping by the function $\Phi : x_1 \mapsto (x_1, x_1^2)$ allows for error-free separation in the new two-dimensional space [1]
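The effect shown in Fig. 7.4 can be illustrated in a few lines of code. The following sketch is purely illustrative: the one-dimensional toy data and the separating line in the mapped space are invented here, and only the map $\Phi(x_1) = (x_1, x_1^2)$ is taken from the figure caption.

```python
import numpy as np

# Toy data (invented): class +1 lies inside an interval of x1, class -1 outside,
# so no single threshold on x1 separates the classes in the original 1-D space.
x1 = np.array([0.10, 0.45, 0.50, 0.55, 0.90])
y  = np.array([  -1,    1,    1,    1,   -1])

# Map into the 2-D space of Fig. 7.4: Phi(x1) = (x1, x1^2)
phi = np.column_stack([x1, x1 ** 2])

# A hand-chosen separating line in the mapped space: x1 - x1^2 - 0.2 = 0
w, b = np.array([1.0, -1.0]), -0.2
print(np.sign(phi @ w + b))   # -> [-1.  1.  1.  1. -1.], matching y
```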
The normal vector $w$ then results in
$$ w = \sum_{l:\,a_l > 0} a_l\, y_l\, \Phi(x_l). \qquad (7.23) $$
Applying $\Phi$, the decision function $d_{w,b}(x)$ results in
$$ d_{w,b}(x) = \operatorname{sgn}\!\left( w^T \Phi(x) + b \right). \qquad (7.24) $$
As
$$ w^T \Phi(x) = \sum_{l:\,a_l > 0} a_l\, y_l\, \Phi(x_l)^T \Phi(x), \qquad (7.25) $$
the transformation $\Phi$ is explicitly needed neither for the estimation of the parameters of the classifier nor for the classification. Instead, a so-called 'kernel function' $K_\Phi(x, x')$ is defined, with the condition
$$ K_\Phi(x, x') = \Phi(x)^T \Phi(x'). \qquad (7.26) $$
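A minimal numerical sketch of this substitution follows; all concrete values (support vectors, multipliers $a_l$, labels $y_l$, bias $b$, and the test sample) are invented for illustration, and the quadratic map of Fig. 7.4 serves as $\Phi$. It checks that evaluating the decision of (7.24) via the explicit transformation and via the kernel of (7.26) yields the same result.

```python
import numpy as np

def phi(x):                        # explicit feature map Phi(x) = (x, x^2)
    return np.array([x, x ** 2])

def kernel(x, x_prime):            # K_Phi(x, x') = Phi(x)^T Phi(x'), cf. (7.26)
    return x * x_prime + (x ** 2) * (x_prime ** 2)

x_sv = np.array([0.3, 0.5, 0.8])   # support vectors (a_l > 0), invented
a    = np.array([1.2, 0.7, 0.9])   # Lagrange multipliers, invented
y    = np.array([-1.0, 1.0, -1.0]) # class labels, invented
b    = 0.1
x    = 0.42                        # sample to classify

# Decision via the explicit transformation, following (7.23) and (7.24)
w = sum(a_l * y_l * phi(x_l) for a_l, y_l, x_l in zip(a, y, x_sv))
d_explicit = np.sign(w @ phi(x) + b)

# Same decision via the kernel only, (7.25)/(7.26): Phi(x) is never formed
d_kernel = np.sign(sum(a_l * y_l * kernel(x_l, x)
                       for a_l, y_l, x_l in zip(a, y, x_sv)) + b)

assert d_explicit == d_kernel      # both routes yield the same sign
```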
The kernel function additionally needs to be positive semi-definite, symmetric, and fulfil the Cauchy-Schwarz inequality. The optimal kernel function for a given classification or regression problem can only be found empirically. However, so-called multi-kernels have recently been proposed to overcome the search for an optimal kernel function [11].
The most frequently used kernel functions comprise:
Polynomial kernel:
$$ K_p(x, x') = \left( x^T x' + 1 \right)^p, \qquad (7.27) $$
where $p$ is the polynomial order.
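A direct implementation of (7.27) is a one-liner; the example vectors below are invented purely to show the call.

```python
import numpy as np

def poly_kernel(x, x_prime, p):
    """Polynomial kernel K_p(x, x') = (x^T x' + 1)^p, cf. (7.27)."""
    return (np.dot(x, x_prime) + 1.0) ** p

# Illustrative vectors (not from the text)
x       = np.array([0.2, 0.4])
x_prime = np.array([0.6, 0.8])
print(poly_kernel(x, x_prime, p=2))   # (0.44 + 1)^2 = 2.0736
```

For $p = 2$ this kernel corresponds to an implicit feature space containing all monomials of the input components up to degree two, without that space ever being constructed explicitly.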