Classification Models in VisMiner - Visual Data Mining: The VisMiner Approach

Figure 5.9 Derived Datasets

As you work with this dataset in the instructions that follow, your results will vary slightly from those shown in the examples. The split of rows into training and validation sets is a random process. Hence, your split will not exactly match the split used to prepare the examples.
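Why two runs produce different splits can be sketched in a few lines of Python. This is only an illustration, not VisMiner's own procedure: the function name split_rows, the 30% validation share, and the seed values are all assumptions made for the example.

```python
import random


def split_rows(rows, validation_percent=30, seed=None):
    """Shuffle a copy of the rows and split it into (training, validation).

    validation_percent is an assumed value; the text does not state the
    proportion that VisMiner uses.
    """
    rng = random.Random(seed)      # seed=None gives a different shuffle each run
    shuffled = list(rows)          # copy so the caller's list is not modified
    rng.shuffle(shuffled)
    n_validation = len(shuffled) * validation_percent // 100
    return shuffled[n_validation:], shuffled[:n_validation]


rows = list(range(100))            # stand-in for the dataset's row ids

# Two different seeds mimic two different users running the same split.
train_a, valid_a = split_rows(rows, seed=1)
train_b, valid_b = split_rows(rows, seed=2)

print(len(train_a), len(valid_a))
print(sorted(valid_a)[:5], sorted(valid_b)[:5])
```

Each split holds the same number of rows, but which rows land in the validation set depends on the shuffle, so error rates computed from them will differ slightly from one run to the next.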

Create two classifiers by dragging the “Dec Tree Classifier” and “SVM Classifier” modelers down to the Target dataset. Select Buyer as the classification column.

View the confusion matrix for each model after it has finished processing. The decision tree should finish first.
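For readers who want to see how an error rate follows from a confusion matrix, here is a minimal Python sketch. The counts are hypothetical and chosen only to make the arithmetic easy to follow; they are not taken from the Target dataset, and the function name error_rate is an assumption.

```python
def error_rate(confusion):
    """Return the fraction of misclassified rows.

    confusion[actual][predicted] holds a count, so the correctly
    classified rows sit on the diagonal.
    """
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[i][i] for i in range(len(confusion)))
    return (total - correct) / total


# Hypothetical two-class matrix (rows: actual class, columns: predicted class).
matrix = [
    [420, 30],   # actual non-buyer: 420 predicted correctly, 30 missed
    [40, 510],   # actual buyer: 40 missed, 510 predicted correctly
]

rate = error_rate(matrix)
print(f"error rate: {rate:.1%}")
```

The off-diagonal cells (30 and 40 here) are the misclassifications; dividing their sum by the total row count gives the error rate.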

You do not have to wait for the SVM to finish before opening the decision tree's confusion matrix. How well does each perform? In the Target dataset created for the examples, the error rates for the decision tree classifier were 11.3% (training) and 14.1% (validation). Because there is not a large difference between the training and validation error rates, we can conclude that the decision tree is not overfit; it generalizes well.

The SVM classifier is quite different. Its error rate on the training data is an almost perfect 1.0%, but it soars to 16.5% on the validation data. Clearly the SVM model is overfit.
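The comparison above can be expressed as a simple gap between validation and training error. The sketch below uses the four error rates quoted in the text; the 5-point threshold (MAX_GAP) is an assumption chosen for illustration, not a rule from VisMiner, and the function name is hypothetical.

```python
def generalization_gap(training_error, validation_error):
    """Return validation error minus training error, in percentage points."""
    return validation_error - training_error


# Error rates (percent) quoted in the text: (training, validation).
rates = {
    "decision tree": (11.3, 14.1),
    "SVM": (1.0, 16.5),
}

MAX_GAP = 5.0  # assumed threshold, in percentage points

for name, (training, validation) in rates.items():
    gap = generalization_gap(training, validation)
    verdict = "possibly overfit" if gap > MAX_GAP else "generalizes reasonably"
    print(f"{name}: gap = {gap:.1f} points -> {verdict}")
```

The decision tree's gap is about 2.8 points and the SVM's is 15.5 points. Note also that the SVM's validation error (16.5%) is higher than the decision tree's (14.1%), so its near-perfect training score does not translate into better predictions on unseen rows.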

Now that you have seen the respective results for the decision tree and SVM classifiers, interactively create an ANN classifier using the Target data. Try
