Table 9.9. Organization of the postoperative patient dataset.

  Attribute   Symbol  Branches                       Arity
  L-CORE      A       high, low, mid                 3
  L-SURF      B       high, low, mid                 3
  L-O2        C       excellent, good                2
  L-BP        D       high, low, mid                 3
  SURF-STBL   E       stable, unstable               2
  CORE-STBL   F       mod-stable, stable, unstable   3
  BP-STBL     G       mod-stable, stable, unstable   3
  COMFORT     H       05, 07, 10, 15, ?              5
For this experiment, a subset of 60 samples was randomly selected for
training and the remaining 30 were used for testing (both sets are
available at the gene expression programming website). The fitness
function was again based on the number of hits and was evaluated by
equation (3.8). As shown in Table 9.9, the eight attributes were
represented by A = {A, ..., H}, splitting respectively into 3, 3, 2, 3,
2, 3, 3, and 5 branches (note that "H" divides into five branches
because of missing values, which are simply handled as an additional
branch). The terminal set was T = {a, b, c}, representing respectively
classes "A", "I", and "S". Both the performance and the parameters used
per run are shown in Table 9.10.
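Since the hits-based fitness of equation (3.8) simply counts correctly classified samples, it can be sketched in a few lines (the function name and list-based interface here are illustrative, not from the original):

```python
def hits_fitness(predictions, targets):
    """Number-of-hits fitness: one point per sample whose predicted
    class matches its true class (cf. equation (3.8))."""
    return sum(p == t for p, t in zip(predictions, targets))

# With 60 training samples, a perfect classifier would score 60.
print(hits_fitness(list("aabca"), list("aabcc")))  # 4 hits out of 5
```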
As shown in Table 9.10, the EDT algorithm performs quite well on this
task, with an average best-of-run fitness of 47.14. Indeed, several good
solutions were designed in this experiment, two of which are shown below:
GDAaHaBcBFcaaabaccaacaacab...
...acbcabbbcacaaabcaaaacccca
(9.10)
GaAEcBFBAaacaacacaacacbccc...
...bbbbcbbbcbbcacccbababcabc
(9.11)
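The node counts claimed below can be checked without drawing anything: a GEP chromosome is expanded breadth-first, with each symbol consuming as many symbols on the next level as its arity (given in Table 9.9), and the terminals a, b, and c consuming none. A minimal sketch of this expansion, counting only the nodes actually used (the function name is illustrative):

```python
# Arities of the attribute symbols (Table 9.9) and the class terminals.
ARITY = {'A': 3, 'B': 3, 'C': 2, 'D': 3, 'E': 2, 'F': 3, 'G': 3, 'H': 5,
         'a': 0, 'b': 0, 'c': 0}

def tree_size(chromosome):
    """Count the nodes used when a GEP chromosome is expanded
    breadth-first (level by level) into a decision tree."""
    level = 1   # the root occupies the first position
    used = 0
    pos = 0
    while level > 0:
        # Each symbol on this level demands `arity` children on the next.
        next_level = sum(ARITY[chromosome[pos + i]] for i in range(level))
        used += level
        pos += level
        level = next_level
    return used

dt1 = ("GDAaHaBcBFcaaabaccaacaacab"
       "acbcabbbcacaaabcaaaacccca")   # chromosome (9.10)
dt2 = ("GaAEcBFBAaacaacacaacacbccc"
       "bbbbcbbbcbbcacccbababcabc")   # chromosome (9.11)
print(tree_size(dt1))  # 24
print(tree_size(dt2))  # 21
```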
As you can see by drawing the trees, the first one encodes a decision tree
with a total of 24 nodes, whereas the second one encodes a DT with 21
nodes. These highly compact models are extremely accurate: the first one
has a training fitness of 50 (83.33% accuracy) and a testing fitness of 23
(76.67% accuracy), whereas the second one has a training fitness of 49
(81.67% accuracy) and a testing fitness of 24 (80.00% accuracy) and are,