Figure 5.9 Derived Datasets
As you work with this dataset in the instructions that follow, your results will
vary slightly from those shown in the examples. The split of rows into training
and validation sets is a random process. Hence, your split will not exactly match
the split used to prepare the examples.
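To illustrate why a random split differs from run to run, here is a minimal Python sketch. It is not the tool's own partitioning routine: the function name split_rows, the 30% validation fraction, and the row ids are all assumptions made for illustration.

```python
import random

def split_rows(rows, validation_fraction=0.3, seed=None):
    """Randomly partition rows into (training, validation) lists.

    Hypothetical helper; the 30% default is an assumption, not the
    fraction used by the tool described in the text.
    """
    rng = random.Random(seed)
    shuffled = list(rows)          # copy so the caller's list is untouched
    rng.shuffle(shuffled)
    n_valid = round(len(shuffled) * validation_fraction)
    return shuffled[n_valid:], shuffled[:n_valid]

rows = list(range(1000))           # stand-in row ids, not the book's data
train_a, valid_a = split_rows(rows, seed=1)
train_b, valid_b = split_rows(rows, seed=2)

print(len(train_a), len(valid_a))  # sizes of the two partitions
print(valid_a == valid_b)          # compares the validation rows of the two runs
```

Two runs with different seeds put different rows into the validation set, so error rates measured on those sets will differ slightly as well. Fixing the seed reproduces the same split.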
Create two classifiers by dragging the “Dec Tree Classifier” and “SVM
Classifier” modelers down to the Target dataset. Select Buyer as the
classification column.
View the confusion matrix for each model after it has finished processing.
The decision tree should finish first.
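For readers who want to see what a confusion matrix summarizes, the following Python sketch counts actual and predicted label pairs and derives an error rate from them. The "Yes"/"No" values for the Buyer column and the sample labels are made up for illustration; the tool's own display may be organized differently.

```python
from collections import Counter

def confusion_matrix(actual, predicted):
    """Count how often each (actual, predicted) label pair occurs."""
    return Counter(zip(actual, predicted))

def error_rate(matrix):
    """Fraction of rows whose predicted label differs from the actual one."""
    total = sum(matrix.values())
    wrong = sum(n for (a, p), n in matrix.items() if a != p)
    return wrong / total

# Made-up labels for illustration only; not taken from the Target dataset.
actual    = ["Yes", "Yes", "No", "No", "No",  "Yes", "No", "No"]
predicted = ["Yes", "No",  "No", "No", "Yes", "Yes", "No", "No"]

matrix = confusion_matrix(actual, predicted)
for (a, p), n in sorted(matrix.items()):
    print(f"actual={a:<3} predicted={p:<3} count={n}")
print(f"error rate: {error_rate(matrix):.1%}")
```

The diagonal cells (actual equals predicted) are the correct classifications; everything off the diagonal contributes to the error rate.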
You do not have to wait for the SVM to finish before opening the decision
tree's confusion matrix. How well does each perform? In the Target dataset
created for these examples, the error rates for the decision tree classifier were
11.3% (training) and 14.1% (validation). Because the training and validation
rates do not differ greatly, we can conclude that the model is not overfit; it
generalizes well.
The SVM classifier is quite different. The error rate for the training data is an
almost perfect 1.0%, but soars to 16.5% in the validation data. Clearly the SVM
model is overfit.
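The comparison above can be made explicit with a short Python sketch that computes the gap between validation and training error for each model, using the rates quoted in the text. The 5-point cutoff is an arbitrary assumption for illustration, not a rule from the text, and the function names are hypothetical.

```python
def generalization_gap(training_error, validation_error):
    """Validation error minus training error, in percentage points."""
    return validation_error - training_error

def looks_overfit(training_error, validation_error, threshold=5.0):
    """Flag a model whose gap exceeds the threshold.

    The 5.0-point default is an illustrative assumption only.
    """
    return generalization_gap(training_error, validation_error) > threshold

# (training %, validation %) error rates quoted in the text
error_rates = {
    "decision tree": (11.3, 14.1),
    "SVM": (1.0, 16.5),
}

for name, (train_err, valid_err) in error_rates.items():
    gap = generalization_gap(train_err, valid_err)
    verdict = "likely overfit" if looks_overfit(train_err, valid_err) else "generalizes reasonably"
    print(f"{name}: gap = {gap:.1f} points, {verdict}")
```

A low training error alone says little. What matters here is how much worse the model does on rows it has not seen.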
Now that you have seen the respective results for the decision tree and SVM
classifiers, interactively create an ANN classifier using the Target data. Try