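a.
0123456789012345
OHOWbababbabbaba
b.
[Decision tree diagram. Root node: OUTLOOK (branches: sunny, overcast, rainy). Internal nodes: HUMIDITY (branches: high, normal), OUTLOOK (branches: sunny, overcast, rainy), WINDY (branches: true, false). Leaves: No, Yes, No, Yes, No, No, Yes.]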
Figure 9.7. Perfect solution to the play tennis problem created in generation 5
(chromosome 2). It solves correctly all the 14 instances of Table 9.1. a) The
chromosome of the individual. b) The corresponding decision tree.
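The way the chromosome in Figure 9.7a unfolds into the tree in Figure 9.7b can be sketched in Python as follows. This is only an illustrative sketch under assumptions that this section does not state explicitly: that O, H, and W stand for OUTLOOK, HUMIDITY, and WINDY, that a and b stand for the classes Yes and No, that each attribute takes as many arguments as it has branches (in the order listed in BRANCHES), and that the chromosome is read breadth-first. The names decode and classify are hypothetical helpers, not part of any library.

```python
from collections import deque

# Assumed symbol meanings and branch orders (inferred from Figure 9.7).
BRANCHES = {
    "O": ("OUTLOOK", ["sunny", "overcast", "rainy"]),
    "H": ("HUMIDITY", ["high", "normal"]),
    "W": ("WINDY", ["true", "false"]),
}
LEAVES = {"a": "Yes", "b": "No"}


def decode(chromosome):
    """Build the tree level by level, reading symbols left to right."""
    symbols = iter(chromosome)
    root = {"symbol": next(symbols), "children": []}
    queue = deque([root])
    while queue:
        node = queue.popleft()
        if node["symbol"] in LEAVES:
            continue  # class labels take no arguments
        for _ in BRANCHES[node["symbol"]][1]:
            child = {"symbol": next(symbols), "children": []}
            node["children"].append(child)
            queue.append(child)
    return root  # symbols left unread form the non-coding region


def classify(node, instance):
    """Follow the branch matching each attribute value down to a leaf."""
    while node["symbol"] not in LEAVES:
        attribute, values = BRANCHES[node["symbol"]]
        node = node["children"][values.index(instance[attribute])]
    return LEAVES[node["symbol"]]


tree = decode("OHOWbababbabbaba")
```

Under these assumptions only the first 11 symbols are expressed, and a call such as classify(tree, {"OUTLOOK": "sunny", "HUMIDITY": "high", "WINDY": "false"}) walks from the root through HUMIDITY to a leaf.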
therefore, both mutation and inversion (the only operators that could
introduce a terminal at the root) were modified so that no terminals end up at
the root. Thus, if the target point is the first position of a gene, the
mutation operator can only replace an attribute by another (mutation at the
remaining positions in the gene obviously obeys the usual rules). As for the
inversion operator, it was modified so as not to touch the first position of
the head.
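The two restrictions just described might be sketched as follows. This is a minimal illustration, not the author's implementation. It assumes a single gene with a head of 5 symbols followed by a tail of class labels only, the attribute symbols O, T, H, and W, and the class symbols a and b. The names mutate_point and invert_head are hypothetical.

```python
import random

ATTRIBUTES = "OTHW"  # assumed attribute symbols
CLASSES = "ab"       # assumed class (terminal) symbols
HEAD_LENGTH = 5      # assumed head size for a 16-symbol gene


def mutate_point(gene, position, rng=random):
    """Replace the symbol at `position` with a different, legal one."""
    if position == 0:
        pool = ATTRIBUTES            # root: an attribute only
    elif position < HEAD_LENGTH:
        pool = ATTRIBUTES + CLASSES  # rest of the head: any symbol
    else:
        pool = CLASSES               # tail: class labels only
    choices = [s for s in pool if s != gene[position]]
    symbols = list(gene)
    symbols[position] = rng.choice(choices)
    return "".join(symbols)


def invert_head(gene, rng=random):
    """Reverse a random stretch of the head, never touching position 0."""
    start, end = sorted(rng.sample(range(1, HEAD_LENGTH), 2))
    return gene[:start] + gene[start:end + 1][::-1] + gene[end + 1:]
```

Because the mutation pool at position 0 contains only attributes and the inversion range starts at position 1, neither operator can place a class label at the root in this sketch.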
9.2 Decision Trees with Numeric/Mixed Attributes
Inducing conventional decision trees with numeric attributes is considerably
more complex than inducing DTs with nominal attributes because there are
many more ways of splitting the data and, no wonder, these trees can get
messy very quickly. We will see that gene expression programming handles
numeric attributes with aplomb and is not affected by the messiness that
plagues the induction of conventional decision trees with numeric attributes.
Consider, for instance, a different version of the play tennis dataset, in
which two of the attributes (TEMPERATURE and HUMIDITY) are numeric
(Table 9.3). A decision tree that describes this dataset is presented in Figure
9.8. And the rules in this case are: