Decision Tree Induction - Gene Expression Programming

Information Technology Reference

In-Depth Information

As you can easily check, these classification rules classify correctly all the 14

instances of Table 9.3 and, therefore, they are a perfect solution to the play

tennis problem.

So, in order to implement such decision trees in gene expression program-

ming one needs to find a way of encoding and fine-tuning the constants (such

as the constant 76 in the decision tree presented in Figure 9.8) that are re-

quired by the numeric attributes to split the data. How this is accomplished is

explained below.

9.2.1 The Architecture

The decision trees with numeric attributes of gene expression programming

explore an architecture very similar to the one used for polynomial induction

(see chapter 7 Polynomial Induction and Time Series Prediction). In this

case, though, a Dc domain with the same size as the head is used to encode

the constants used for splitting the numeric attributes.

Consider, for instance, the gene below with a head size of five (the Dc is

shown in bold):

012345678901234567890

WOTHabababbbabba 46336 (9.3)

Its expression results in the following decision tree:

W

4

O

T

6

3

a

H

b

3

6

a

b

As you can see, every node in the head, irrespective of its type (numeric

attribute, nominal attribute, or terminal), has associated with it a random

numerical constant that, for simplicity, is represented by a numeral 0-9. These

RNCs are encoded in the Dc domain and, as shown above, their expression

follows a very simple path: from top to bottom and from left to right, the

elements in Dc are assigned one-by-one to the elements in the tree. So, for

the following array of random numerical constants:

Search WWH ::

Custom Search

Home