Database Reference
In-Depth Information
6.2.7
Minimum Description Length
(
MDL
)
Pruning
MDL can be used for evaluating the generalized accuracy of a node
( [ Rissanen (1989) ] , [ Quinlan and Rivest (1989) ] and [ Mehta et al . (1995) ] ).
This method measures the size of a decision tree by means of the number
of bits required to encode the tree. The MDL method prefers decision trees
that can be encoded with fewer bits. Mehta et al . (1995) indicate that the
cost of a split at a leaf t can be estimated as:
Cost(t) =
c i ∈dom ( y )
|
S t |
+ |
dom ( y )
|−
ln |
S t |
2
1
|
σ y = c i S t
ln
|
σ y = c i S t |
2
(6.6)
π |dom ( y ) |
2
Γ( |dom ( y ) |
2
+ln
,
)
where S t denotes the instances that have reached node t . The splitting
cost of an internal node is calculated based on the cost aggregation of its
children.
6.2.8
Other Pruning Methods
There are other pruning methods reported in the literature. Wallaces
and Patrick (1993) proposed a Minimum Message Length (MML) pruning
method while Kearns and Mansour (1998) provide a theoretically-justified
pruning algorithm. Mingers (1989) proposed the Critical Value Pruning
(CVP). This method, which prunes an internal node if its splitting criterion
is not greater than a certain threshold, is similar to a stopping criterion.
However, contrary to a stopping criterion, a node is not pruned if at least
one of its children does not fulfill the pruning criterion.
6.2.9
Comparison of Pruning Methods
Several studies compare the performance of different pruning techniques:
( [ Quinlan (1987) ] , [ Mingers (1989) ] and [ Esposito et al . (1997) ] ). The results
indicate that some methods (such as Cost-Complexity Pruning, Reduced
Error Pruning) tend to over-pruning, i.e. creating smaller but less accurate
decision trees. Other methods (like error-based pruning, pessimistic error
pruning and minimum error pruning) bias toward under-pruning. Most of
the comparisons concluded that the no free lunch theorem also applies
to pruning, namely, there is no pruning method that outperforms other
pruning methods.
Search WWH ::




Custom Search