Chapter 11
Advanced Decision Trees
11.1 Oblivious Decision Trees
Oblivious decision trees are trees in which all nodes at the same level test
the same feature. Despite this restriction, oblivious decision trees are
effective for feature selection. Almuallim and Dietterich (1994) as well as
Schlimmer (1993) proposed a forward feature selection procedure based on
constructing oblivious decision trees, whereas Langley and Sage (1994)
suggested backward selection using the same means. Kohavi and Sommerfield
(1998) showed that an oblivious decision tree can be converted to a decision
table.
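The equivalence to a decision table follows from the fixed feature ordering: every root-to-leaf path tests the same features in the same order, so each leaf can be keyed by a tuple of feature values. A minimal sketch of this idea, using the glucose/age example from Figure 11.1 (the 10/2 counts for the first row come from the text; the remaining rows are hypothetical counts invented for illustration):

```python
# Each path of an oblivious tree over (Glucose, Age) is a tuple of the
# values tested at every layer, so the tree collapses into a flat
# lookup table -- i.e. a decision table. Only the ("G<107", "A>50")
# counts come from the text; the other rows are hypothetical.
oblivious_leaves = {
    ("G<107", "A>50"): {"diabetes": 10, "no_diabetes": 2},
    ("G<107", "A<=50"): {"diabetes": 1, "no_diabetes": 7},
    ("G>=107", "A>50"): {"diabetes": 9, "no_diabetes": 3},
    ("G>=107", "A<=50"): {"diabetes": 4, "no_diabetes": 6},
}

def classify(glucose_bin, age_bin):
    """Return the majority class for a path, as a decision table would."""
    counts = oblivious_leaves[(glucose_bin, age_bin)]
    return max(counts, key=counts.get)
```

A regular decision tree cannot be flattened this way, because different branches may test different features at the same depth.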
Figure 11.1 illustrates a typical oblivious decision tree with four input
features: Glucose level (G), Age (A), Hypertension (H) and Pregnant (P),
and a Boolean target feature indicating whether the patient suffers
from diabetes. Each layer is uniquely associated with an input feature and
represents the interaction of that feature with the input features of the
previous layers. The number that appears in each terminal node indicates
the number of instances that fit the corresponding path. For example, among
patients whose glucose level is less than 107 and whose age is greater than
50, 10 are positively diagnosed with diabetes while 2 are not.
The principal difference between an oblivious decision tree and a
regular decision tree is the constant ordering of input attributes
along every path of the oblivious decision tree. This property
is necessary for minimizing the overall subset of input attributes (resulting
in dimensionality reduction). The arcs that connect the terminal nodes to
the nodes of the target layer are labeled with the number of records that
fit this path.
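The greedy construction mentioned next can be sketched as follows: at each layer, pick the as-yet-unused feature whose values, taken jointly with the features already chosen for earlier layers, share the most mutual information with the target. This is an illustrative sketch of the general idea, not the exact procedure of any of the cited algorithms; the function names and the stopping rule (exhausting all features) are assumptions:

```python
import math
from collections import Counter

def mutual_information(xs, ys):
    """Empirical mutual information I(X; Y), in bits, between two
    discrete sequences of equal length."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    mi = 0.0
    for (x, y), c in pxy.items():
        # p(x, y) * log2( p(x, y) / (p(x) * p(y)) )
        mi += (c / n) * math.log2((c * n) / (px[x] * py[y]))
    return mi

def choose_layer_features(rows, target, features):
    """Greedily order the features for the layers of an oblivious tree:
    each layer adds the feature that maximizes the mutual information
    between the joint feature values chosen so far and the target."""
    chosen, remaining = [], list(features)
    while remaining:
        def joint_mi(f):
            keys = [tuple(r[g] for g in chosen + [f]) for r in rows]
            return mutual_information(keys, target)
        best = max(remaining, key=joint_mi)
        chosen.append(best)
        remaining.remove(best)
    return chosen
```

For instance, on a toy dataset where feature G alone determines the target while A is noise, G is selected for the first layer. A practical implementation would also stop adding layers once the mutual information gain becomes negligible, which is how the procedure yields a reduced feature subset.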
An oblivious decision tree is usually built by a greedy algorithm, which
tries to maximize the mutual information measure in every layer. The