mapping function M(x) is more likely to be linearly separable than in the low-dimensional input space of the original vectors (see Figure 10.1).
Figure 10.1. Nonlinear mapping M from the low-dimensional data space into the high-dimensional feature space, where the support vectors determine the optimal hyperplane H_opt
If the data are only nonlinearly separable in the low-dimensional data space, linear separability can still be achieved in the high-dimensional feature space by constructing a hyperplane as a linear discriminant. In this way, a data classifier can be built, as illustrated by the following example.
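As a minimal sketch of this idea (the data, the mapping M(x) = (x, x^2), and the hyperplane parameters below are made up for illustration and are not taken from the text): points on a line that cannot be split by any single threshold become linearly separable after a nonlinear mapping into a 2-D feature space.

```python
# Hypothetical 1-D data: class A (y = -1) lies inside [-1, 1],
# class B (y = +1) lies outside, so no threshold on x alone separates them.
# After the nonlinear mapping M(x) = (x, x^2) into a 2-D feature space,
# the hyperplane with w = (0, 1), b = -2 (i.e. x^2 = 2) separates them.

def M(x):
    """Assumed nonlinear mapping from the 1-D data space to a 2-D feature space."""
    return (x, x * x)

class_a = [-0.5, 0.0, 0.8]    # y = -1
class_b = [-3.0, 2.5, 4.0]    # y = +1

w, b = (0.0, 1.0), -2.0       # linear discriminant w^T M(x) + b in feature space

def side(x):
    """Signed value of the linear discriminant at the mapped point M(x)."""
    fx = M(x)
    return w[0] * fx[0] + w[1] * fx[1] + b

print(all(side(x) < 0 for x in class_a))   # True: all of class A on one side
print(all(side(x) > 0 for x in class_b))   # True: all of class B on the other
```

Note that the discriminant is linear in the feature space even though the resulting decision boundary in the original data space (|x| = sqrt(2)) is nonlinear.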
Let a set of labelled training patterns (x_i, y_i) be available, with i = 1, 2, ..., N, where x_i is an n-dimensional pattern vector and y_i is the desired output corresponding to the input pattern vector x_i. The values of y_i belong to the linearly separable classes A with y = -1 and B with y = +1. The postulated separability condition implies that there exists an n-dimensional weight vector w and a scalar b such that
w^T x_i + b < 0   for y_i = -1,   (10.1)

w^T x_i + b ≥ 0   for y_i = +1,   (10.2)
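The separability conditions in Eqs. (10.1)-(10.2) can be checked directly: a candidate pair (w, b) separates the training set exactly when every pattern falls on the side of the hyperplane prescribed by its label. The 2-D patterns and the candidate parameters below are assumptions made for illustration, not data from the text.

```python
def separates(w, b, patterns):
    """Return True iff (w, b) satisfies Eqs. (10.1)-(10.2):
    w^T x_i + b < 0 whenever y_i = -1, and >= 0 whenever y_i = +1."""
    for x, y in patterns:
        g = sum(wj * xj for wj, xj in zip(w, x)) + b   # w^T x_i + b
        if (y == -1 and g >= 0) or (y == +1 and g < 0):
            return False
    return True

# Made-up, linearly separable training patterns (x_i, y_i).
patterns = [((0.0, 0.0), -1), ((1.0, 0.5), -1),
            ((3.0, 3.0), +1), ((4.0, 2.5), +1)]

print(separates((1.0, 1.0), -4.0, patterns))  # True: x1 + x2 = 4 separates
print(separates((1.0, 0.0), -0.5, patterns))  # False: x1 = 0.5 does not
```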
whereby the parametric equation

w^T x + b = 0   (10.3)
defines the n-dimensional separating hyperplane. The distance from the hyperplane to the nearest data point is called the margin of separation. The objective is to determine the specific hyperplane that maximizes this margin between the two classes, called the
optimal hyperplane H_opt, defined in the feature space by the parametric equation

w_0^T x + b_0 = 0   (10.4)
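As a sketch of what maximizing the margin means (the patterns and candidate hyperplanes below are assumptions for illustration, not from the text): the geometric margin of a separating hyperplane w^T x + b = 0 is min_i y_i (w^T x_i + b) / ||w||, and among all separating hyperplanes, H_opt is the one for which this quantity is largest.

```python
import math

def margin(w, b, patterns):
    """Geometric margin of the hyperplane w^T x + b = 0:
    the smallest signed distance y_i (w^T x_i + b) / ||w|| over the data."""
    norm = math.sqrt(sum(wj * wj for wj in w))
    return min(y * (sum(wj * xj for wj, xj in zip(w, x)) + b)
               for x, y in patterns) / norm

# Made-up separable training patterns.
patterns = [((0.0, 0.0), -1), ((1.0, 1.0), -1),
            ((3.0, 3.0), +1), ((4.0, 4.0), +1)]

# Two hyperplanes that both separate the data; the second sits midway
# between the classes and therefore has the larger margin.
print(round(margin((1.0, 1.0), -5.0, patterns), 3))  # 0.707
print(round(margin((1.0, 1.0), -4.0, patterns), 3))  # 1.414
```

Searching over (w, b) for the maximum of this function is what the SVM training problem formalizes; in practice it is solved as a constrained optimization rather than by enumeration.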