Information Technology Reference
In-Depth Information
As you can easily check, these classification rules classify correctly all the 14
instances of Table 9.3 and, therefore, they are a perfect solution to the play
tennis problem.
So, in order to implement such decision trees in gene expression program-
ming one needs to find a way of encoding and fine-tuning the constants (such
as the constant 76 in the decision tree presented in Figure 9.8) that are re-
quired by the numeric attributes to split the data. How this is accomplished is
explained below.
9.2.1 The Architecture
The decision trees with numeric attributes of gene expression programming
explore an architecture very similar to the one used for polynomial induction
(see chapter 7 Polynomial Induction and Time Series Prediction). In this
case, though, a Dc domain with the same size as the head is used to encode
the constants used for splitting the numeric attributes.
Consider, for instance, the gene below with a head size of five (the Dc is
shown in bold):
012345678901234567890
WOTHabababbbabba 46336 (9.3)
Its expression results in the following decision tree:
W
4
O
T
6
3
a
a
H
b
b
3
6
a
b
As you can see, every node in the head, irrespective of its type (numeric
attribute, nominal attribute, or terminal), has associated with it a random
numerical constant that, for simplicity, is represented by a numeral 0-9. These
RNCs are encoded in the Dc domain and, as shown above, their expression
follows a very simple path: from top to bottom and from left to right, the
elements in Dc are assigned one-by-one to the elements in the tree. So, for
the following array of random numerical constants:
Search WWH ::




Custom Search