
Fig. 22.2 Support vector machines

to avoid a bad local optimum. However, the time complexity will increase correspondingly.

22.2.2 Support Vector Machines

Intuitively, a support vector machine constructs a hyperplane in a high- or infinite-dimensional space, which can be used for classification. A good separation is achieved by the hyperplane that has the largest distance (i.e., margin) to the nearest training sample of any class, since in general the larger the margin, the lower the generalization error of the classifier. In this regard, support vector machines belong to the family of large-margin classifiers.

For ease of discussion, we start with the linear model $w^T x$. If the training data are linearly separable, then the following constraints are to be satisfied:

$$y_i\bigl(w^T x_i + b\bigr) \ge 1.$$

Given the constraints, the goal is to find the hyperplane with the maximum margin. It is not difficult to see from Fig. 22.2 that the margin can be represented by $\frac{2}{\|w\|}$. Therefore, we have the following optimization problem:

$$\min \frac{1}{2}\|w\|^2, \quad \text{s.t.}\ y_i\bigl(w^T x_i + b\bigr) \ge 1.$$
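As a concrete illustration, the hard-margin problem above can be approximated with an off-the-shelf solver. The sketch below (assuming scikit-learn and NumPy are available, with a hypothetical linearly separable toy data set) fits a linear SVM with a very large penalty $C$ and recovers the margin $\frac{2}{\|w\|}$ from the learned weights:

```python
import numpy as np
from sklearn.svm import SVC

# Hypothetical linearly separable 2-D toy data (two clusters).
X = np.array([[0., 0.], [1., 1.], [2., 0.],
              [3., 3.], [4., 2.], [5., 4.]])
y = np.array([-1, -1, -1, 1, 1, 1])

# A very large C approximates the hard-margin formulation:
# slack is penalized so heavily that the constraints
# y_i (w^T x_i + b) >= 1 are (numerically) enforced.
clf = SVC(kernel="linear", C=1e6)
clf.fit(X, y)

w = clf.coef_[0]       # learned weight vector w
b = clf.intercept_[0]  # learned bias b

# Geometric margin between the two supporting hyperplanes.
margin = 2.0 / np.linalg.norm(w)

# Functional margin y_i (w^T x_i + b) of every training point;
# for separable data and large C its minimum is close to 1.
functional_margins = y * (X @ w + b)
print(margin, functional_margins.min())
```

The support vectors are exactly the points whose functional margin equals 1; all other points lie strictly outside the two supporting hyperplanes.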

If the data are not linearly separable, the constraints cannot be satisfied. In this case, the concept of the soft margin is introduced, and the optimization problem becomes

$$\min \frac{1}{2}\|w\|^2 + C\sum_i \xi_i, \quad \text{s.t.}\ y_i\bigl(w^T x_i + b\bigr) \ge 1 - \xi_i,\ \xi_i \ge 0.$$
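Given a fitted model, the slack variables can be read off as $\xi_i = \max\bigl(0,\ 1 - y_i(w^T x_i + b)\bigr)$. A minimal sketch (again assuming scikit-learn and NumPy, with a hypothetical non-separable toy set containing one outlier) shows that a finite $C$ permits margin violations at a cost of $C\,\xi_i$ each:

```python
import numpy as np
from sklearn.svm import SVC

# Hypothetical toy data: one negative outlier sits inside the
# positive cluster, so the data are not linearly separable.
X = np.array([[0., 0.], [1., 0.], [0., 1.], [3.3, 3.3],
              [3., 3.], [4., 3.], [3., 4.]])
y = np.array([-1, -1, -1, -1, 1, 1, 1])

# Finite C: violations of y_i (w^T x_i + b) >= 1 are allowed,
# but each costs C * xi_i in the objective.
clf = SVC(kernel="linear", C=1.0)
clf.fit(X, y)

w, b = clf.coef_[0], clf.intercept_[0]

# Slack of each point: xi_i = max(0, 1 - y_i (w^T x_i + b)).
# Since the data are not separable, at least one xi_i > 0.
slacks = np.maximum(0.0, 1.0 - y * (X @ w + b))
print(slacks.round(2))
```

Larger values of $C$ shrink the total slack at the expense of a narrower margin; smaller values tolerate more violations in exchange for a wider margin.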