Decision Tree Induction - Gene Expression Programming - page 369

classifies all the testing instances correctly, which corresponds to a training set accuracy of 99.0% and a testing set accuracy of 100%:

0123456789012345678901234567890
PQQPTcQabcbbccccbcbcc9336038121
C = {3.16, 2.61, 1.76, 1.61, 3.11, 5.64, 2.25, 1.58, 1.74, 4.91}          (9.7)

As its expression in Figure 9.18 shows, it encodes a very compact decision tree with just 13 nodes and with PETAL_LENGTH (“P”) at the root. Note again that, this time, it was the attribute SEPAL_LENGTH (“S”) that was not used to distinguish between the three kinds of iris plants.
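To make the link between the linear chromosome and the 13-node tree concrete, here is a minimal Python sketch that decodes the chromosome of (9.7) breadth-first. Several things in it are assumptions rather than statements from the text: the head/tail/Dc lengths (10, 11 and 10), the symbol meanings other than “P” and “S”, and the rule that an attribute at head position i takes its threshold from C[Dc[i]]. That last rule was chosen because it reproduces the thresholds shown in Figure 9.18. The names `decode`, `ATTRIBUTES` and `CLASSES` are illustrative only.

```python
# Chromosome and constants copied from (9.7).
CHROMOSOME = "PQQPTcQabcbbccccbcbcc9336038121"
CONSTANTS = [3.16, 2.61, 1.76, 1.61, 3.11, 5.64, 2.25, 1.58, 1.74, 4.91]

# Assumed gene layout: head of 10, tail of 11, Dc of 10 symbols.
HEAD_LEN, TAIL_LEN = 10, 11

# Assumed symbol meanings; only "P" and "S" are named in the text.
ATTRIBUTES = {"S": "SEPAL_LEN", "T": "SEPAL_WID",
              "P": "PETAL_LEN", "Q": "PETAL_WID"}
CLASSES = {"a": "Set", "b": "Ver", "c": "Vir"}


def decode(chromosome, constants):
    """Return (tree, node_count).

    A leaf is a class label; an internal node is the tuple
    (attribute, threshold, subtree_if_le, subtree_if_gt).
    """
    body = chromosome[:HEAD_LEN + TAIL_LEN]
    dc = chromosome[HEAD_LEN + TAIL_LEN:]

    # Breadth-first pass: each attribute node claims the next two
    # unread symbols as its children (every attribute has arity 2 here).
    children = {}
    next_free = 1
    pos = 0
    while pos < next_free:
        if body[pos] in ATTRIBUTES:
            children[pos] = (next_free, next_free + 1)
            next_free += 2
        pos += 1

    def build(p):
        symbol = body[p]
        if symbol in CLASSES:
            return CLASSES[symbol]
        # Assumption: head position p reads its threshold via Dc[p].
        threshold = constants[int(dc[p])]
        left, right = children[p]
        return (ATTRIBUTES[symbol], threshold, build(left), build(right))

    return build(0), next_free


tree, node_count = decode(CHROMOSOME, CONSTANTS)
print(node_count)  # 13, the node count quoted above
print(tree[0], tree[1])  # PETAL_LEN 4.91
```

Under these assumptions only the first 13 symbols of the gene are expressed, and the symbol “S” never appears among them, which is consistent with the remark that SEPAL_LENGTH goes unused.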

Let's now see how the EDT-RNC algorithm deals with complex problems with mixed attributes.

[Figure 9.18 image. Panel a: the chromosome and constant array given in (9.7). Panel b: the decision tree diagram. The root tests PETAL_LEN against 4.91; both second-level nodes test PETAL_WID against 1.61; the third level tests PETAL_LEN against 2.25, SEPAL_WID against 3.16 and PETAL_WID against 1.74, alongside a Vir leaf; the bottom leaves read, left to right, Set, Ver, Vir, Ver, Ver, Vir.]

Figure 9.18. Testing the generalizing capabilities of the EDT-RNC algorithm on the iris data. a) The linear representation of the model. b) The decision tree model. This model has 99.0% accuracy on the training set and generalizes outstandingly well on the testing set, with 100% accuracy.
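The decision tree of Figure 9.18 can also be read as a handful of nested rules. The sketch below is one possible transcription, assuming that the left branch of every test means “less than or equal to” the threshold and the right branch means “greater than”. The function name `classify` and the sample measurements are illustrative and do not come from the text.

```python
def classify(sepal_wid, petal_len, petal_wid):
    """Return 'Set', 'Ver' or 'Vir' by following the tree of Figure 9.18.

    Assumption: left branch is '<=' and right branch is '>'.
    SEPAL_LENGTH is not a parameter because the tree never tests it.
    """
    if petal_len <= 4.91:
        if petal_wid <= 1.61:
            return "Set" if petal_len <= 2.25 else "Ver"
        return "Vir" if sepal_wid <= 3.16 else "Ver"
    if petal_wid <= 1.61:
        return "Vir"
    return "Ver" if petal_wid <= 1.74 else "Vir"


# Illustrative measurements in cm (sepal width, petal length, petal width).
print(classify(3.5, 1.4, 0.2))  # Set
print(classify(2.8, 4.0, 1.3))  # Ver
print(classify(3.0, 6.0, 2.5))  # Vir
```

Written this way, each root-to-leaf path of the tree becomes one conjunction of threshold tests, which makes it easy to see which attributes a given prediction depends on.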
