9.1 Decision Trees with Nominal Attributes
Describing data with a decision tree is both easy on the eye and an excellent
way of understanding the data. Consider, for instance, the play tennis data
presented in Table 9.1, a famous toy dataset in decision tree induction (Quinlan
1986). The dataset records which weather conditions are suitable for playing
tennis. As you can see, there are four attributes - OUTLOOK, TEMPERATURE,
HUMIDITY, and WINDY - and the decision whether or not to play depends on
their values. In this particular case, all four attributes take nominal values:
OUTLOOK can be “sunny”, “overcast”, or “rainy”; TEMPERATURE can be “hot”,
“mild”, or “cool”; HUMIDITY can be “high” or “normal”; and WINDY can
be “true” or “false”. (One way such a table might be encoded in code is
sketched just after Table 9.1.)
Table 9.1  A small training set with nominal attributes.

OUTLOOK    TEMPERATURE  HUMIDITY  WINDY  Play
sunny      hot          high      false  No
sunny      hot          high      true   No
overcast   hot          high      false  Yes
rainy      mild         high      false  Yes
rainy      cool         normal    false  Yes
rainy      cool         normal    true   No
overcast   cool         normal    true   Yes
sunny      mild         high      false  No
sunny      cool         normal    false  Yes
rainy      mild         normal    false  Yes
sunny      mild         normal    true   Yes
overcast   mild         high      true   Yes
overcast   hot          normal    false  Yes
rainy      mild         high      true   No
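To make the idea of nominal attributes concrete, here is one way the training set of Table 9.1 might be represented in code. This is a minimal sketch in Python, not taken from the text: each example becomes a dictionary mapping an attribute name to one of its nominal values, plus the Play label, and the names `ATTRIBUTES`, `ROWS`, and `examples` are illustrative choices.

```python
from collections import Counter

ATTRIBUTES = ["OUTLOOK", "TEMPERATURE", "HUMIDITY", "WINDY"]

# The 14 training examples of Table 9.1, one tuple per row:
# (OUTLOOK, TEMPERATURE, HUMIDITY, WINDY, Play).
ROWS = [
    ("sunny",    "hot",  "high",   "false", "No"),
    ("sunny",    "hot",  "high",   "true",  "No"),
    ("overcast", "hot",  "high",   "false", "Yes"),
    ("rainy",    "mild", "high",   "false", "Yes"),
    ("rainy",    "cool", "normal", "false", "Yes"),
    ("rainy",    "cool", "normal", "true",  "No"),
    ("overcast", "cool", "normal", "true",  "Yes"),
    ("sunny",    "mild", "high",   "false", "No"),
    ("sunny",    "cool", "normal", "false", "Yes"),
    ("rainy",    "mild", "normal", "false", "Yes"),
    ("sunny",    "mild", "normal", "true",  "Yes"),
    ("overcast", "mild", "high",   "true",  "Yes"),
    ("overcast", "hot",  "normal", "false", "Yes"),
    ("rainy",    "mild", "high",   "true",  "No"),
]
examples = [dict(zip(ATTRIBUTES + ["Play"], row)) for row in ROWS]

# Each attribute takes a small, fixed set of nominal values.
for attr in ATTRIBUTES:
    print(attr, sorted({ex[attr] for ex in examples}))

# Class distribution over the whole table: 9 "Yes" versus 5 "No".
print(Counter(ex["Play"] for ex in examples))
```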
A decision tree learned from this dataset is presented in Figure 9.1. The
classification rules of such a tree are obtained by following every path from
the root node down to a leaf node. For the decision tree of Figure 9.1 there
are five different paths, and hence five classification rules. Starting with
the leftmost path and moving towards the right, they are as follows:

1. IF OUTLOOK = sunny AND HUMIDITY = high THEN Play = No
2. IF OUTLOOK = sunny AND HUMIDITY = normal THEN Play = Yes
3. IF OUTLOOK = overcast THEN Play = Yes
4. IF OUTLOOK = rainy AND WINDY = true THEN Play = No
5. IF OUTLOOK = rainy AND WINDY = false THEN Play = Yes
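The text describes the result of tree induction; the sketch below shows, under stated assumptions, how such a tree and its rules could be obtained automatically. It is a minimal ID3-style learner in the spirit of Quinlan (1986): it greedily splits on the attribute with the highest information gain and then prints one IF-THEN rule per root-to-leaf path. All function names are illustrative, the Table 9.1 encoding is repeated so the sketch runs on its own, and the order in which the rules are printed need not match the left-to-right layout of Figure 9.1.

```python
from collections import Counter
from math import log2

ATTRIBUTES = ["OUTLOOK", "TEMPERATURE", "HUMIDITY", "WINDY"]
ROWS = [  # Table 9.1 again: (OUTLOOK, TEMPERATURE, HUMIDITY, WINDY, Play)
    ("sunny", "hot", "high", "false", "No"),
    ("sunny", "hot", "high", "true", "No"),
    ("overcast", "hot", "high", "false", "Yes"),
    ("rainy", "mild", "high", "false", "Yes"),
    ("rainy", "cool", "normal", "false", "Yes"),
    ("rainy", "cool", "normal", "true", "No"),
    ("overcast", "cool", "normal", "true", "Yes"),
    ("sunny", "mild", "high", "false", "No"),
    ("sunny", "cool", "normal", "false", "Yes"),
    ("rainy", "mild", "normal", "false", "Yes"),
    ("sunny", "mild", "normal", "true", "Yes"),
    ("overcast", "mild", "high", "true", "Yes"),
    ("overcast", "hot", "normal", "false", "Yes"),
    ("rainy", "mild", "high", "true", "No"),
]
examples = [dict(zip(ATTRIBUTES + ["Play"], row)) for row in ROWS]

def entropy(exs):
    """Shannon entropy (in bits) of the Play labels in `exs`."""
    counts = Counter(ex["Play"] for ex in exs)
    total = sum(counts.values())
    return -sum(c / total * log2(c / total) for c in counts.values())

def information_gain(exs, attr):
    """Reduction in entropy obtained by splitting `exs` on `attr`."""
    remainder = 0.0
    for value in {ex[attr] for ex in exs}:
        subset = [ex for ex in exs if ex[attr] == value]
        remainder += len(subset) / len(exs) * entropy(subset)
    return entropy(exs) - remainder

def build_tree(exs, attrs):
    """ID3-style induction: a leaf is a label string; an internal node is
    a pair (attribute, {value: subtree})."""
    labels = {ex["Play"] for ex in exs}
    if len(labels) == 1:              # pure node: stop with a leaf
        return labels.pop()
    if not attrs:                     # attributes exhausted: majority leaf
        return Counter(ex["Play"] for ex in exs).most_common(1)[0][0]
    best = max(attrs, key=lambda a: information_gain(exs, a))
    rest = [a for a in attrs if a != best]
    return (best, {
        value: build_tree([ex for ex in exs if ex[best] == value], rest)
        for value in {ex[best] for ex in exs}
    })

def print_rules(tree, conditions=()):
    """Turn every root-to-leaf path into an IF-THEN classification rule."""
    if isinstance(tree, str):         # leaf reached: emit one rule
        body = " AND ".join(f"{a} = {v}" for a, v in conditions)
        print(f"IF {body} THEN Play = {tree}")
        return
    attr, branches = tree
    for value, subtree in branches.items():
        print_rules(subtree, conditions + ((attr, value),))

print_rules(build_tree(examples, ATTRIBUTES))
```

On this training set the information gain of OUTLOOK (about 0.247 bits) exceeds that of the other three attributes, so the sketch places OUTLOOK at the root and recovers exactly the five rules listed above.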