9.3.1.13 Order Statistics
Order statistics can be used to combine classifiers [Tumer and Ghosh (2000)]. These combiners offer the simplicity of weighted combination methods together with the generality of meta-combination methods (see the following section). Their robustness is helpful when there are significant variations among the classifiers in some parts of the instance space.
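The order-statistics idea can be sketched as follows (a minimal illustration; the probability values and the NumPy implementation are assumptions, not from the text): each base classifier emits class posteriors, and the combiner applies an order statistic (min, max, or median) to each class across classifiers before choosing the most probable class.

```python
import numpy as np

# Hypothetical posteriors from three base classifiers for four instances
# over three classes: shape (n_classifiers, n_instances, n_classes).
probs = np.array([
    [[0.7, 0.2, 0.1], [0.1, 0.8, 0.1], [0.3, 0.3, 0.4], [0.5, 0.4, 0.1]],
    [[0.6, 0.3, 0.1], [0.2, 0.6, 0.2], [0.2, 0.5, 0.3], [0.4, 0.4, 0.2]],
    [[0.8, 0.1, 0.1], [0.3, 0.5, 0.2], [0.1, 0.2, 0.7], [0.6, 0.3, 0.1]],
])

def order_statistic_combine(probs, statistic="median"):
    """Combine per-class posteriors with an order statistic across classifiers."""
    ops = {"min": np.min, "max": np.max, "median": np.median}
    combined = ops[statistic](probs, axis=0)   # (n_instances, n_classes)
    return combined.argmax(axis=1)             # predicted class per instance

print(order_statistic_combine(probs, "median"))
```

Because an order statistic such as the median ignores extreme scores, a single badly miscalibrated classifier in some region of the instance space has limited influence on the combined decision.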
9.3.2 Meta-combination Methods
Meta-learning means learning from the classifiers produced by the inducers
and from the classifications of these classifiers on training data. The follow-
ing sections describe the most well-known meta-combination methods.
9.3.2.1 Stacking
Stacking is a technique whose aim is to achieve the highest possible generalization accuracy [Wolpert (1992)]. By using a meta-learner, this method tries to induce
which classifiers are reliable and which are not. Stacking is usually employed
to combine models built by different inducers. The idea is to create a
meta-dataset containing a tuple for each tuple in the original dataset.
However, instead of using the original input attributes, it uses the predicted
classifications by the classifiers as the input attributes. The target attribute
remains as in the original training set. A test instance is first classified
by each of the base classifiers. These classifications are fed into a meta-
level training set from which a meta-classifier is produced. This classifier
combines the different predictions into a final one.
It is recommended that the original dataset be partitioned into two subsets. The first subset is reserved for forming the meta-dataset, and the second subset is used to build the base-level classifiers. Consequently, the meta-classifier's predictions reflect the true performance of the base-level learning algorithms.
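The procedure above can be sketched in Python (a minimal illustration assuming scikit-learn and the iris data; the model choices and names such as `base_models` and `meta_model` are illustrative, not from the text):

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
# Partition the original dataset: one subset builds the base-level
# classifiers, the other is reserved for forming the meta-dataset.
X_base, X_meta, y_base, y_meta = train_test_split(
    X, y, test_size=0.5, random_state=0)

# Base-level classifiers built by different inducers.
base_models = [
    DecisionTreeClassifier(random_state=0),
    GaussianNB(),
    KNeighborsClassifier(),
]
for model in base_models:
    model.fit(X_base, y_base)

# Meta-dataset: one tuple per held-out instance; the input attributes are
# the base classifiers' predicted classes, and the target attribute is the
# original class label.
meta_X = np.column_stack([m.predict(X_meta) for m in base_models])
meta_model = LogisticRegression(max_iter=1000).fit(meta_X, y_meta)

# A test instance is first classified by each base classifier; those
# predictions are then combined by the meta-classifier into a final one.
test_instance = X_meta[:1]
meta_features = np.column_stack([m.predict(test_instance) for m in base_models])
final = meta_model.predict(meta_features)
```

Because the base classifiers never see the held-out subset during training, the meta-dataset records how they behave on unseen data rather than on memorized training instances.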
Stacking performance can be improved by using output probabilities for
every class label from the base-level classifiers. In such cases, the number of
input attributes in the meta-dataset is multiplied by the number of classes.
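The multiplication of attributes can be seen in a short sketch (scikit-learn and the three-class iris data assumed; the two model choices are illustrative): each base classifier contributes one probability column per class, so the meta-dataset width becomes the number of classifiers times the number of classes.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)  # 3 classes
X_base, X_meta, y_base, y_meta = train_test_split(
    X, y, test_size=0.5, random_state=0)

models = [
    DecisionTreeClassifier(random_state=0).fit(X_base, y_base),
    GaussianNB().fit(X_base, y_base),
]

# predict_proba yields one column per class label; stacking the outputs of
# two classifiers over three classes gives 2 * 3 = 6 meta-level attributes.
meta_X = np.column_stack([m.predict_proba(X_meta) for m in models])
print(meta_X.shape)  # (75, 6): 75 held-out instances, 6 attributes
```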
It has been shown that with stacking the ensemble performs (at best) comparably to selecting the best classifier from the ensemble by cross-validation [Dzeroski and Zenko (2004)]. In order to improve the existing stacking approach, they employed a new multi-response model tree to learn