tuples were removed and training was repeated, the same separating hyperplane would
be found. Furthermore, the number of support vectors found can be used to compute
an (upper) bound on the expected error rate of the SVM classifier, which is independent
of the data dimensionality. An SVM with a small number of support vectors can have
good generalization, even when the dimensionality of the data is high.
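To see why such a bound holds, note that removing any training tuple that is not a support vector leaves the separating hyperplane unchanged, so only support vectors can contribute leave-one-out errors. A standard form of the resulting bound (stated here in LaTeX; the exact formulation is not given in this passage) is

\[
E[P(\text{error})] \;\le\; \frac{E[\text{number of support vectors}]}{\text{number of training tuples}},
\]

where the expectations are taken over random draws of the training set. The data dimensionality does not appear on the right-hand side, which is what makes the bound dimension-independent.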
9.3.2 The Case When the Data Are Linearly Inseparable
In Section 9.3.1 we learned about linear SVMs for classifying linearly separable data, but
what if the data are not linearly separable, as in Figure 9.10? In such cases, no straight
line can be found that would separate the classes. The linear SVMs we studied would
not be able to find a feasible solution here. Now what?
The good news is that the approach described for linear SVMs can be extended to
create nonlinear SVMs for the classification of linearly inseparable data (also called
nonlinearly separable data, or nonlinear data for short). Such SVMs are capable of finding
nonlinear decision boundaries (i.e., nonlinear hypersurfaces) in input space.
“So,” you may ask, “how can we extend the linear approach?” We obtain a nonlinear
SVM by extending the approach for linear SVMs as follows. There are two main steps.
In the first step, we transform the original input data into a higher dimensional space
using a nonlinear mapping. Several common nonlinear mappings can be used in this
step, as we will further describe next. Once the data have been transformed into the
new, higher dimensional space, the second step searches for a linear separating hyperplane in the new
space. We again end up with a quadratic optimization problem that can be solved using
the linear SVM formulation. The maximal marginal hyperplane found in the new space
corresponds to a nonlinear separating hypersurface in the original space.
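To make the two steps concrete, here is a minimal sketch in Python. The use of scikit-learn's LinearSVC and the specific mapping x -> (x, x^2) are illustrative assumptions, not prescribed by the text: step 1 applies an explicit nonlinear mapping, and step 2 fits an ordinary linear SVM in the mapped space.

# A minimal sketch of the two-step idea (assumes scikit-learn is available;
# the text itself names no library or particular mapping).
import numpy as np
from sklearn.svm import LinearSVC

# 1-D data that no single threshold can separate:
# class +1 lies between two clusters of class -1.
X = np.array([-3.0, -2.5, -2.0, -0.5, 0.0, 0.5, 2.0, 2.5, 3.0]).reshape(-1, 1)
y = np.array([-1, -1, -1, +1, +1, +1, -1, -1, -1])

# Step 1: transform the input into a higher dimensional space
# using a nonlinear mapping, here phi(x) = (x, x^2).
def phi(X):
    return np.hstack([X, X ** 2])

# Step 2: search for a linear separating hyperplane in the new space.
clf = LinearSVC(C=10.0).fit(phi(X), y)
print(clf.predict(phi(X)))   # recovers the training labels

# The hyperplane w1*x + w2*x^2 + b = 0 in (x, x^2) space corresponds
# to a nonlinear (parabola-shaped) decision boundary in the original
# 1-D input space, mirroring the correspondence described above.
print(clf.coef_, clf.intercept_)

In the mapped space the two classes differ cleanly in the x^2 coordinate, so a linear hyperplane separates them there, even though no straight-line (here, single-threshold) separator exists in the original input space.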
[Figure 9.10 plot: axes A1 (horizontal) and A2 (vertical); Class 1, y = +1 (buys_computer = yes); Class 2, y = −1 (buys_computer = no).]
Figure 9.10 A simple 2-D case showing linearly inseparable data. Unlike the linearly separable data of
Figure 9.7, here it is not possible to draw a straight line to separate the classes. Instead, the
decision boundary is nonlinear.