12.3 Support-Vector Machines
We can view a support-vector machine, or SVM, as an improvement on the perceptron that is designed to address the problems mentioned in Section 12.2.7. An SVM selects one particular hyperplane that not only separates the points in the two classes, but does so in a way that maximizes the margin: the distance between the hyperplane and the closest points of the training set.
12.3.1 The Mechanics of an SVM
The goal of an SVM is to select a hyperplane w · x + b = 0 that maximizes the distance γ between the hyperplane and any point of the training set. The idea is suggested by Fig. 12.14. There, we see the points of two classes and a hyperplane dividing them.
Figure 12.14 An SVM selects the hyperplane with the greatest possible margin γ between the hyperplane and the training points
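To make the definition concrete, here is a minimal Python sketch (our own illustration, not from the text; the helper name margin is hypothetical) that computes γ for a candidate hyperplane w · x + b = 0 as the minimum of |w · x + b| / ||w|| over the training points. An SVM chooses, among all separating hyperplanes, the one for which this quantity is largest.

import numpy as np

def margin(w, b, points):
    """Distance from the hyperplane w.x + b = 0 to the closest point."""
    w = np.asarray(w, dtype=float)
    points = np.asarray(points, dtype=float)
    # Perpendicular distance of each point from the hyperplane.
    distances = np.abs(points @ w + b) / np.linalg.norm(w)
    return distances.min()

points = [[1.0, 1.0], [0.0, 1.0],   # one class, below the line x1 + x2 = 3
          [3.0, 2.0], [2.0, 3.0]]   # other class, above it
print(margin([1.0, 1.0], -3.0, points))  # ~0.707, set by the point (1, 1)

For this toy data, the hyperplane x1 + x2 = 3 achieves margin 1/√2; the training method developed in the following sections searches for the w and b that maximize this value while keeping the two classes on opposite sides.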
Intuitively, we are more certain of the class of points that are far from the separating hyperplane than we are of points near that hyperplane. Thus, it is desirable that all the training points be as far from the hyperplane as possible (but on the correct side of that hyperplane, of course). An added advantage of choosing the separating hyperplane to have as large a margin as possible is that the full data set may contain points closer to the hyperplane than any point of the training set. If so, these points have a better chance of being classified properly than if we had chosen a hyperplane that separated the training points but allowed some of them to lie very close to the hyperplane itself. In that case, a new point falling near such a training point, and thus near the hyperplane, would stand a fair chance of being misclassified. This issue was discussed in Section 12.2.7 in connection with Fig. 12.13.