As the tree progresses toward the leaf nodes, at times there is not enough
room to show the split criteria. In these cases, the criteria box either draws a
“...” or is left blank.
To see the split criteria and node contents, hover over any of the nodes.
The leaf nodes of the tree all contain homogeneous content, except for the
node containing two Versicolor observations and one Virginica observation.
This node represents the one error in the confusion matrix. When the decision
tree is used to make a prediction, the input attribute values are used to navigate
to a leaf node. The most frequently occurring category of that leaf node then
becomes the predicted value.
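
The prediction step described above can be sketched in a few lines of Python. This is only an illustration, not the modeler's actual implementation: the nested-dictionary tree layout, the attribute names, the thresholds, and all leaf counts other than the mixed leaf (two Versicolor, one Virginica) are assumptions made up for the example.

```python
# Hypothetical tree: internal nodes test one attribute against a threshold,
# leaf nodes store the class counts of the training observations they hold.
# Attribute names, thresholds, and most counts are illustrative assumptions.
tree = {
    "attr": "petal_length", "threshold": 2.45,
    "left": {"counts": {"Setosa": 50}},
    "right": {
        "attr": "petal_width", "threshold": 1.75,
        # The mixed leaf from the text: two Versicolor, one Virginica.
        "left": {"counts": {"Versicolor": 2, "Virginica": 1}},
        "right": {"counts": {"Virginica": 49}},
    },
}


def predict(node, observation):
    """Walk from the root to a leaf, then return the leaf's majority class."""
    while "counts" not in node:
        go_left = observation[node["attr"]] <= node["threshold"]
        node = node["left"] if go_left else node["right"]
    counts = node["counts"]
    # The most frequently occurring category in the leaf is the prediction.
    return max(counts, key=counts.get)


example = {"petal_length": 5.0, "petal_width": 1.6}
print(predict(tree, example))
```

In this sketch, an observation that lands in the mixed leaf is predicted as Versicolor, so the single Virginica training observation in that leaf is the one that ends up misclassified.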
Exercise 5.1
Use the OliveOil dataset to generate classification models based on the acid
measures.
a. Build classification models to predict Region using the decision tree, ANN,
and SVM classification modelers. Note: The modelers automatically use
all attributes in the dataset for model construction. Since you do not want
to use Area to classify Region, you will first need to create a derived
set that excludes Area, then build the models using the derived set. Look
at the confusion matrices for all three models. How well do they predict
the training set values?
b. Build classification models to predict Area using the decision tree, ANN, and
SVM classification modelers. Look at the confusion matrices. How well do
they predict the training set values? Which modeler performs best? Which
Areas do the models have the most trouble predicting? Hint: The cells in
the matrix off the main diagonal (excluding the totals column and row) with
the tallest bars represent the observations most frequently misclassified.
c. View the tree graph for the decision tree model. Which acid best distinguishes
the South Apulia oils? Describe the primary distinguishing acid
characteristics of the Inland Sardinia oils.
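
The hint in part b, reading the largest off-diagonal cells of a confusion matrix, can also be expressed programmatically. The following is a minimal sketch that assumes rows are actual categories and columns are predicted categories, with the totals row and column already removed. The function name, the labels A, B, and C, and the counts are placeholders, not results from the OliveOil dataset.

```python
def most_confused(matrix, labels, top=3):
    """Return the largest off-diagonal cells as (count, actual, predicted)."""
    cells = [
        (count, labels[i], labels[j])
        for i, row in enumerate(matrix)
        for j, count in enumerate(row)
        if i != j and count > 0  # skip the main diagonal and empty cells
    ]
    cells.sort(key=lambda cell: cell[0], reverse=True)
    return cells[:top]


# Placeholder counts for three made-up categories (rows: actual, cols: predicted).
labels = ["A", "B", "C"]
matrix = [
    [10, 2, 0],
    [1, 8, 3],
    [0, 0, 12],
]

for count, actual, predicted in most_confused(matrix, labels):
    print(f"{count} observations of {actual} were predicted as {predicted}")
```

The cells at the top of the returned list correspond to the tallest off-diagonal bars in the matrix view.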
Prediction Likelihoods
To this point, we have only evaluated input contributions of the decision tree
models using the tree graph. Decision trees are relatively simple structures.
The structure of other models is not as easy to visualize due to the complexity
 