to that presented in Figure 3.1. However, instead of looking for a split on a
single attribute, it looks for the best function of the input attributes. In each
iteration, the algorithm considers partitioning the training set using the
outcome of a discrete function of the input attributes; the most appropriate
function is selected according to some splitting measure. After an appropriate
split is selected, each node further subdivides the training set into smaller
subsets, until no split yields a sufficient value of the splitting measure or a
stopping criterion is satisfied. Because there are infinitely many candidate
functions, the main challenge is to decide on which functions the algorithm
should concentrate.
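
To make the iteration concrete, here is a minimal sketch in Python of how the best candidate function might be selected. It assumes binary splits scored by Gini impurity reduction; the function names and the idea of passing an explicit candidate_functions collection are illustrative assumptions, not the algorithm of any particular system.

import numpy as np

def gini(y):
    # Gini impurity of a vector of class labels.
    _, counts = np.unique(y, return_counts=True)
    p = counts / counts.sum()
    return 1.0 - np.sum(p ** 2)

def split_score(y, mask):
    # Impurity reduction achieved by partitioning y with a boolean mask.
    left, right = y[mask], y[~mask]
    if len(left) == 0 or len(right) == 0:
        return 0.0
    weighted = (len(left) * gini(left) + len(right) * gini(right)) / len(y)
    return gini(y) - weighted

def best_multivariate_split(X, y, candidate_functions):
    # Pick the candidate function whose True/False outcome partitions the
    # training set with the largest impurity reduction.
    best_f, best_gain = None, 0.0
    for f in candidate_functions:  # each f maps a row of X to True or False
        mask = np.array([bool(f(row)) for row in X])
        gain = split_score(y, mask)
        if gain > best_gain:
            best_f, best_gain = f, gain
    return best_f, best_gain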
Most multivariate splitting criteria are based on a linear combination of the
input attributes. In this case the algorithm constructs hyperplanes that are
oblique, that is, not parallel to a coordinate axis. Figure 11.7 illustrates an
oblique decision tree: the left node in the second level tests a linear
combination of the two input attributes, 3 · YRSJOB − 2 · DEPEND. It is
reasonable to assume that oblique decision trees require considerably fewer
planes than a regular decision tree, resulting in a smaller tree.
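
Expressed as code, the oblique test in Figure 11.7 is simply a thresholded linear combination. The sketch below is illustrative only; the cut-point threshold=10.0 is a hypothetical placeholder, since the figure's actual value is not given in the text.

def oblique_test(yrsjob, depend, threshold=10.0):
    # Oblique split in the spirit of Figure 11.7: route a record by a linear
    # combination of two attributes instead of a single attribute.
    # threshold=10.0 is a hypothetical placeholder, not taken from the figure.
    return 3.0 * yrsjob - 2.0 * depend < threshold

An axis-parallel tree can only approximate such a slanted decision boundary with a staircase of univariate splits, which is why oblique trees tend to be smaller.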
Finding the best linear combination can be performed using greedy search
([Breiman et al. (1984)]; [Murthy (1998)]); linear programming ([Duda and
Hart (1973)]; [Bennett and Mangasarian (1994)]); linear discriminant analysis
([Duda and Hart (1973)]; [Friedman (1977)]; [Sklansky and Wassel (1981)];
[Lin and Fu (1983)]; [Loh and Vanichsetakul (1988)]; …).
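
As a rough illustration of the greedy-search option, the sketch below starts from an axis-parallel split and perturbs one hyperplane coefficient at a time, keeping any change that improves the impurity reduction. It reuses numpy and the split_score helper from the earlier sketch, and is only a simplified rendering of this family of methods, not the procedure published in any of the cited works.

def greedy_oblique_split(X, y, steps=(-1.0, -0.1, 0.1, 1.0), n_passes=5):
    # Greedily refine hyperplane coefficients w and threshold b so that the
    # split "x . w < b" maximizes impurity reduction (simplified sketch).
    w = np.zeros(X.shape[1])
    w[0] = 1.0                      # start from an axis-parallel split
    b = float(np.median(X[:, 0]))
    best = split_score(y, X @ w < b)
    for _ in range(n_passes):
        improved = False
        for j in range(len(w)):     # perturb one coefficient at a time
            for step in steps:
                w_try = w.copy()
                w_try[j] += step
                gain = split_score(y, X @ w_try < b)
                if gain > best:
                    w, best, improved = w_try, gain, True
        for step in steps:          # perturb the threshold as well
            gain = split_score(y, X @ w < b + step)
            if gain > best:
                b, best, improved = b + step, gain, True
        if not improved:            # stop at a local optimum
            break
    return w, b, best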
[Figure: a decision tree whose root tests LTV (branches < 75% and ≥ 75%); one second-level node tests the linear combination 3·YRSJOB − 2·DEPEND, the other tests MARST (branches < 1 and ≥ 1); the leaves are labeled A, M, D, D, and A.]
Fig. 11.7 Illustration of an oblique decision tree.