[Figure: the root node tests OUTLOOK (sunny, overcast, rainy); the sunny branch tests HUMIDITY (high → No, normal → Yes); the overcast branch leads directly to Yes; the rainy branch tests WINDY (true → No, false → Yes).]
Figure 9.1. Decision tree for deciding whether to play tennis or not.
1. IF OUTLOOK = sunny AND HUMIDITY = high THEN PLAY = No;
2. IF OUTLOOK = sunny AND HUMIDITY = normal THEN PLAY = Yes;
3. IF OUTLOOK = overcast THEN PLAY = Yes;
4. IF OUTLOOK = rainy AND WINDY = true THEN PLAY = No;
5. IF OUTLOOK = rainy AND WINDY = false THEN PLAY = Yes.
As you can easily check, these rules correctly classify all 14 instances of Table 9.1 and are therefore a perfect solution to the problem at hand.
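Assuming the rules above are applied as written, they can be sketched as a small classifier function (attribute and value names follow the text; the function name `play` and the dict-based interface are illustrative choices, not part of the original):

```python
def play(instance):
    """Classify a play-tennis instance with rules 1-5 above.

    `instance` maps attribute names to nominal values, e.g.
    {"OUTLOOK": "sunny", "HUMIDITY": "high", "WINDY": "false"}.
    """
    outlook = instance["OUTLOOK"]
    if outlook == "sunny":
        # Rules 1 and 2: the sunny branch is decided by HUMIDITY.
        return "No" if instance["HUMIDITY"] == "high" else "Yes"
    if outlook == "overcast":
        # Rule 3: overcast always means play.
        return "Yes"
    # Rules 4 and 5: the rainy branch is decided by WINDY.
    return "No" if instance["WINDY"] == "true" else "Yes"
```

For example, `play({"OUTLOOK": "rainy", "WINDY": "false"})` returns "Yes", matching rule 5.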
Let's now see how such decision trees with nominal attributes can be induced with gene expression programming.
9.1.1 The Architecture
Gene expression programming can be used to induce decision trees by dealing with the attributes as if they were functions and the leaf nodes as if they were terminals. Thus, for the play tennis data of Table 9.1, the attribute set A will consist of OUTLOOK, TEMPERATURE, HUMIDITY, and WINDY, which will be respectively represented by “O”, “T”, “H”, and “W”, thus giving A = {O, T, H, W}. Furthermore, all these attribute nodes have associated with them a specific arity or number of branches n that will determine their growth and, ultimately, the growth of the tree. For instance, OUTLOOK is split into three branches (sunny, overcast, and rainy); HUMIDITY into two branches (high and normal); TEMPERATURE into three (hot, mild, and cool); and WINDY into two (true and false). The terminal set T will consist in this case of “Yes” and “No” (the two different outcomes of the class