9.6.4 Pruning — Post Selection of the Ensemble Size
As in decision tree induction, it is sometimes useful to let the ensemble
grow freely and then prune it in order to obtain a more effective and
compact ensemble. Post selection of the ensemble size allows the ensemble
to be optimized for performance metrics such as accuracy, cross entropy,
mean precision, or ROC area. Empirical examinations indicate that pruned
ensembles can achieve accuracy comparable to that of the original
ensemble [Margineantu and Dietterich (1997)]. In another empirical study,
conducted to understand the effect of ensemble size on ensemble accuracy
and diversity, it has been shown that a small ensemble can be kept while
maintaining accuracy and diversity similar to those of a full ensemble
[Liu et al. (2004)].
The pruning methods can be divided into two groups: pre-combining
pruning methods and post-combining pruning methods.
9.6.4.1 Pre-combining Pruning
Pre-combining pruning is performed before the classifiers are combined:
classifiers that seem to perform well are included in the ensemble.
Prodromidis et al. (1999) present three methods for pre-combining pruning:
selection based on each classifier's individual performance on a separate
validation set, selection based on diversity metrics, and selection based
on the classifiers' ability to correctly classify specific classes.
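As an illustration of the first criterion, the following minimal sketch (in Python; the function name is illustrative, and the classifiers are assumed to expose a scikit-learn-style predict method) keeps only the k classifiers with the highest accuracy on a held-out validation set:

import numpy as np

def prune_by_validation_accuracy(classifiers, X_val, y_val, k):
    # Pre-combining pruning: rank each classifier by its individual
    # accuracy on a separate validation set and keep the top k.
    scores = [np.mean(clf.predict(X_val) == y_val) for clf in classifiers]
    ranked = np.argsort(scores)[::-1]  # indices of best classifiers first
    return [classifiers[i] for i in ranked[:k]]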
In attribute bagging, the classification accuracy of randomly selected
m-attribute subsets is evaluated using the wrapper approach, and only the
classifiers constructed on the highest-ranking subsets participate in the
ensemble voting.
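A rough sketch of this wrapper step, assuming scikit-learn-style estimators (the function name and the choice of a decision tree as the base learner are assumptions for illustration, not part of the original attribute bagging description):

import numpy as np
from sklearn.base import clone
from sklearn.tree import DecisionTreeClassifier

def attribute_bagging_prune(X_tr, y_tr, X_val, y_val, m, n_subsets, k):
    # Draw random m-attribute subsets, train a classifier on each,
    # score it on the validation set (wrapper evaluation), and keep
    # only the k highest-ranking classifiers for the ensemble vote.
    rng = np.random.default_rng(0)
    base = DecisionTreeClassifier()
    candidates = []
    for _ in range(n_subsets):
        attrs = rng.choice(X_tr.shape[1], size=m, replace=False)
        clf = clone(base).fit(X_tr[:, attrs], y_tr)
        candidates.append((clf.score(X_val[:, attrs], y_val), attrs, clf))
    candidates.sort(key=lambda t: t[0], reverse=True)
    return [(attrs, clf) for _, attrs, clf in candidates[:k]]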
9.6.4.2 Post-combining Pruning
In post-combining pruning methods, classifiers are removed based on their
contribution to the collective performance of the ensemble.
Prodromidis et al. (1999) examine two methods for post-combining pruning,
assuming that the classifiers are combined using a meta-combination
method: one based on decision tree pruning and one based on the
correlation of each base classifier with the unpruned meta-classifier.
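The sketch below illustrates the general idea of contribution-based removal with a simple leave-one-out test against a majority vote; this is only an illustration of the principle, not Prodromidis's meta-classifier procedure (integer class labels and scikit-learn-style predict methods are assumed):

import numpy as np

def vote_accuracy(classifiers, X_val, y_val):
    # Accuracy of a plain majority vote; assumes integer class labels.
    preds = np.array([clf.predict(X_val) for clf in classifiers])
    vote = np.apply_along_axis(lambda c: np.bincount(c).argmax(), 0, preds)
    return np.mean(vote == y_val)

def prune_by_contribution(classifiers, X_val, y_val):
    # Post-combining pruning: drop any classifier whose removal does
    # not lower the collective (majority-vote) validation accuracy.
    full = vote_accuracy(classifiers, X_val, y_val)
    kept = [c for i, c in enumerate(classifiers)
            if vote_accuracy(classifiers[:i] + classifiers[i+1:],
                             X_val, y_val) < full]
    return kept or classifiers  # never return an empty ensemble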
A forward stepwise selection procedure can be used to select the most
relevant classifiers (those that maximize the ensemble's performance)
from among thousands of classifiers [Caruana et al. (2004)]. It has been
shown that feature selection algorithms can be used for this purpose.
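A minimal sketch of such a forward stepwise selection, in the spirit of Caruana et al. (2004) but greatly simplified (no selection with replacement or bagged selection; a majority vote stands in for their model averaging; integer class labels are assumed):

import numpy as np

def forward_stepwise_selection(classifiers, X_val, y_val, max_size):
    # Greedily add, at each step, the classifier whose inclusion most
    # improves the majority-vote accuracy on the validation set.
    preds = [clf.predict(X_val) for clf in classifiers]
    selected, best_acc = [], 0.0
    while len(selected) < max_size:
        best_i = None
        for i in range(len(classifiers)):
            if i in selected:
                continue
            trial = np.array([preds[j] for j in selected] + [preds[i]])
            vote = np.apply_along_axis(
                lambda c: np.bincount(c).argmax(), 0, trial)
            acc = np.mean(vote == y_val)
            if acc > best_acc:
                best_acc, best_i = acc, i
        if best_i is None:  # no remaining classifier improves the vote
            break
        selected.append(best_i)
    return [classifiers[i] for i in selected]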