Figure 12.11 A training set may not allow the existence of any separating hyperplane
One might argue that, based on the observations of Section 12.2.6, it should be possible
to find some function on the points that would transform them to another space where they
were linearly separable. That might be the case, but if so, it would probably be an example
of overfitting, the situation where the classifier works very well on the training set, because
it has been carefully designed to handle each training example correctly. However, because
the classifier is exploiting details of the training set that do not apply to other examples that
must be classified in the future, the classifier will not perform well on new data.
Another problem is illustrated in Fig. 12.12. Usually, if classes can be separated by
one hyperplane, then there are many different hyperplanes that will separate the points.
However, not all hyperplanes are equally good. For instance, if we choose the hyperplane
that is furthest clockwise, then the point indicated by “?” will be classified as a circle, even
though we intuitively see it as closer to the squares. When we meet support-vector
machines in Section 12.3, we shall see that there is a way to insist that the hyperplane chosen
be the one that in a sense divides the space most fairly.
Figure 12.12 Generally, more than one hyperplane can separate the classes if they can be separated at all
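The point about competing hyperplanes can be made concrete with a small sketch. The training points, the two hand-picked hyperplanes, and the query point below are all hypothetical values chosen for illustration; they are not the data of Fig. 12.12. Both hyperplanes classify every training point correctly, yet they disagree on the new point.

```python
# Hypothetical 2-D training set: +1 stands for a square, -1 for a circle.
TRAINING = [
    ((1.0, 3.0), +1),
    ((2.0, 4.0), +1),
    ((3.0, 1.0), -1),
    ((4.0, 2.0), -1),
]

def classify(w, theta, x):
    """Return +1 if the dot product w.x exceeds theta, otherwise -1."""
    dot = sum(wi * xi for wi, xi in zip(w, x))
    return 1 if dot > theta else -1

def separates(w, theta, data):
    """True if the hyperplane (w, theta) classifies every training point correctly."""
    return all(classify(w, theta, x) == y for x, y in data)

# Two hand-picked separating hyperplanes (assumed values, not learned).
H1 = ((-1.0, 1.0), 0.0)   # the diagonal line x2 = x1
H2 = ((-1.0, 0.0), -2.5)  # the vertical line x1 = 2.5

QUERY = (2.8, 3.5)  # a new, unlabeled point

for name, (w, theta) in (("H1", H1), ("H2", H2)):
    print(name, "separates training set:", separates(w, theta, TRAINING),
          "| label for query:", classify(w, theta, QUERY))
```

Both lines are perfect on the training set, so training accuracy alone gives no reason to prefer one over the other.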
Yet another problem is illustrated by Fig. 12.13. Most rules for training a perceptron stop
as soon as there are no misclassified points. As a result, the chosen hyperplane will be one
that just manages to classify some of the points correctly. For instance, the upper line in
Fig. 12.13 has just managed to accommodate two of the squares, and the lower line has
just managed to accommodate one of the circles. If either of these lines represents the final
weight vector, then the weights are biased toward one of the classes. That is, they correctly
classify the points in the training set, but the upper line would classify new squares that are
just below it as circles, while the lower line would classify circles just above it as squares.
Again, a more equitable choice of separating hyperplane will be shown in Section 12.3.
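The stopping behavior described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the exact training rule of any particular section: the data set is hypothetical, the learning rate `eta` is fixed at 1, and the threshold is folded into the weight vector by appending a constant 1 to every point. Training halts after the first full pass with no misclassified point, and the helper `closest_distance` (a name introduced here) then measures how close the resulting line lies to each class.

```python
import math

# Hypothetical 2-D training set: +1 stands for a square, -1 for a circle.
TRAINING = [
    ((1.0, 4.0), +1),
    ((2.0, 5.0), +1),
    ((4.0, 1.0), -1),
    ((5.0, 3.0), -1),
]

def train_perceptron(data, eta=1.0, max_epochs=1000):
    """Update w on each misclassified point; stop after the first
    full pass through the data with no misclassifications."""
    # A constant 1 is appended to each point so the last weight acts as a bias.
    w = [0.0, 0.0, 0.0]
    for _ in range(max_epochs):
        errors = 0
        for x, y in data:
            xa = (x[0], x[1], 1.0)
            score = sum(wi * xi for wi, xi in zip(w, xa))
            if y * score <= 0:  # a point exactly on the hyperplane counts as wrong
                w = [wi + eta * y * xi for wi, xi in zip(w, xa)]
                errors += 1
        if errors == 0:
            return w
    raise RuntimeError("no separating hyperplane found within max_epochs")

def closest_distance(w, data, label):
    """Distance from the line w to the nearest training point with the given label."""
    norm = math.hypot(w[0], w[1])
    return min(abs(w[0] * x[0] + w[1] * x[1] + w[2]) / norm
               for x, y in data if y == label)

w = train_perceptron(TRAINING)
d_square = closest_distance(w, TRAINING, +1)
d_circle = closest_distance(w, TRAINING, -1)
print("weights:", w)
print("closest square:", d_square, "closest circle:", d_circle)
```

On this toy data the procedure stops with a line that lies nearer to the closest circle than to the closest square, because nothing in the stopping rule asks for the two gaps to be balanced.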