to that presented in Figure 3.1. However, instead of looking for a split on a
single attribute, it looks for the best function of the input attributes. In each
iteration, the algorithm considers partitioning the training set using the
outcome of a discrete function of the input attributes; the most appropriate
function is selected according to some splitting measure. After an appropriate
split is selected, each node further subdivides the training set into smaller
subsets, until no split yields a sufficient value of the splitting measure or a
stopping criterion is satisfied. Because there are infinitely many candidate
functions, the main challenge is to decide on which functions the algorithm
should concentrate.
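
To make the iteration concrete, here is a minimal sketch in Python of how the best candidate function might be selected. It assumes binary splits scored by Gini impurity reduction; the function names and the idea of passing an explicit candidate_functions collection are illustrative assumptions, not the algorithm of any particular system.

import numpy as np

def gini(y):
    # Gini impurity of a vector of class labels.
    _, counts = np.unique(y, return_counts=True)
    p = counts / counts.sum()
    return 1.0 - np.sum(p ** 2)

def split_score(y, mask):
    # Impurity reduction achieved by partitioning y with a boolean mask.
    left, right = y[mask], y[~mask]
    if len(left) == 0 or len(right) == 0:
        return 0.0
    weighted = (len(left) * gini(left) + len(right) * gini(right)) / len(y)
    return gini(y) - weighted

def best_multivariate_split(X, y, candidate_functions):
    # Pick the candidate function whose True/False outcome partitions the
    # training set with the largest impurity reduction.
    best_f, best_gain = None, 0.0
    for f in candidate_functions:  # each f maps a row of X to True or False
        mask = np.array([bool(f(row)) for row in X])
        gain = split_score(y, mask)
        if gain > best_gain:
            best_f, best_gain = f, gain
    return best_f, best_gain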
Most multivariate splitting criteria are based on a linear combination of the
input attributes. In this case the algorithm constructs hyperplanes that are
oblique, that is, not parallel to a coordinate axis. Figure 11.7 illustrates an
oblique decision tree: the left node in the second level tests a linear
combination of the two input attributes, 3 · YRSJOB − 2 · DEPEND. It is
reasonable to assume that oblique decision trees require considerably fewer
planes than a regular decision tree, resulting in a smaller tree.
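
Expressed as code, the oblique test in Figure 11.7 is simply a thresholded linear combination. The sketch below is illustrative only; the cut-point threshold=10.0 is a hypothetical placeholder, since the figure's actual value is not given in the text.

def oblique_test(yrsjob, depend, threshold=10.0):
    # Oblique split in the spirit of Figure 11.7: route a record by a linear
    # combination of two attributes instead of a single attribute.
    # threshold=10.0 is a hypothetical placeholder, not taken from the figure.
    return 3.0 * yrsjob - 2.0 * depend < threshold

An axis-parallel tree can only approximate such a slanted decision boundary with a staircase of univariate splits, which is why oblique trees tend to be smaller.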
Finding the best linear combination can be performed using greedy search
([Breiman et al. (1984)]; [Murthy (1998)]); linear programming ([Duda and
Hart (1973)]; [Bennett and Mangasarian (1994)]); linear discriminant analysis
([Duda and Hart (1973)]; [Friedman (1977)]; [Sklansky and Wassel (1981)];
[Lin and Fu (1983)]; [Loh and Vanichsetakul (1988)]; …).
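
As a rough illustration of the greedy-search option, the sketch below starts from an axis-parallel split and perturbs one hyperplane coefficient at a time, keeping any change that improves the impurity reduction. It reuses numpy and the split_score helper from the earlier sketch, and is only a simplified rendering of this family of methods, not the procedure published in any of the cited works.

def greedy_oblique_split(X, y, steps=(-1.0, -0.1, 0.1, 1.0), n_passes=5):
    # Greedily refine hyperplane coefficients w and threshold b so that the
    # split "x . w < b" maximizes impurity reduction (simplified sketch).
    w = np.zeros(X.shape[1])
    w[0] = 1.0                      # start from an axis-parallel split
    b = float(np.median(X[:, 0]))
    best = split_score(y, X @ w < b)
    for _ in range(n_passes):
        improved = False
        for j in range(len(w)):     # perturb one coefficient at a time
            for step in steps:
                w_try = w.copy()
                w_try[j] += step
                gain = split_score(y, X @ w_try < b)
                if gain > best:
                    w, best, improved = w_try, gain, True
        for step in steps:          # perturb the threshold as well
            gain = split_score(y, X @ w < b + step)
            if gain > best:
                b, best, improved = b + step, gain, True
        if not improved:            # stop at a local optimum
            break
    return w, b, best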
[Figure: a decision tree whose root tests LTV (branches < 75% and ≥ 75%); one second-level node tests the linear combination 3·YRSJOB − 2·DEPEND, the other tests MARST (branches < 1 and ≥ 1); the leaves are labeled A, M, D, D, and A.]
Fig. 11.7 Illustration of an oblique decision tree.