9.3.1.13 Order Statistics
Order statistics can be used to combine classifiers [Tumer and Ghosh (2000)]. These combiners offer the simplicity of weighted combination methods together with the generality of meta-combination methods (see the following section). Their robustness is helpful when there are significant variations among the classifiers in some parts of the instance space.
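The order-statistics idea can be sketched as follows (a minimal illustration; the probability values and the NumPy implementation are assumptions, not from the text): each base classifier emits class posteriors, and the combiner applies an order statistic (min, max, or median) to each class across classifiers before choosing the most probable class.

```python
import numpy as np

# Hypothetical posteriors from three base classifiers for four instances
# over three classes: shape (n_classifiers, n_instances, n_classes).
probs = np.array([
    [[0.7, 0.2, 0.1], [0.1, 0.8, 0.1], [0.3, 0.3, 0.4], [0.5, 0.4, 0.1]],
    [[0.6, 0.3, 0.1], [0.2, 0.6, 0.2], [0.2, 0.5, 0.3], [0.4, 0.4, 0.2]],
    [[0.8, 0.1, 0.1], [0.3, 0.5, 0.2], [0.1, 0.2, 0.7], [0.6, 0.3, 0.1]],
])

def order_statistic_combine(probs, statistic="median"):
    """Combine per-class posteriors with an order statistic across classifiers."""
    ops = {"min": np.min, "max": np.max, "median": np.median}
    combined = ops[statistic](probs, axis=0)   # (n_instances, n_classes)
    return combined.argmax(axis=1)             # predicted class per instance

print(order_statistic_combine(probs, "median"))
```

Because an order statistic such as the median ignores extreme scores, a single badly miscalibrated classifier in some region of the instance space has limited influence on the combined decision.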
9.3.2 Meta-combination Methods
Meta-learning means learning from the classifiers produced by the inducers
and from the classifications of these classifiers on training data. The follow-
ing sections describe the most well-known meta-combination methods.
9.3.2.1 Stacking
Stacking is a technique whose aim is to achieve the highest possible generalization accuracy [Wolpert (1992)]. By using a meta-learner, this method tries to induce
which classifiers are reliable and which are not. Stacking is usually employed
to combine models built by different inducers. The idea is to create a
meta-dataset containing a tuple for each tuple in the original dataset.
However, instead of using the original input attributes, it uses the predicted
classifications by the classifiers as the input attributes. The target attribute
remains as in the original training set. A test instance is first classified
by each of the base classifiers. These classifications are fed into a meta-
level training set from which a meta-classifier is produced. This classifier
combines the different predictions into a final one.
It is recommended that the original dataset be partitioned into two subsets. The first subset is reserved for forming the meta-dataset, and the second subset is used to build the base-level classifiers. Consequently, the meta-classifier's predictions reflect the true performance of the base-level learning algorithms.
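The procedure above can be sketched in Python (a minimal illustration assuming scikit-learn and the iris data; the model choices and names such as `base_models` and `meta_model` are illustrative, not from the text):

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
# Partition the original dataset: one subset builds the base-level
# classifiers, the other is reserved for forming the meta-dataset.
X_base, X_meta, y_base, y_meta = train_test_split(
    X, y, test_size=0.5, random_state=0)

# Base-level classifiers built by different inducers.
base_models = [
    DecisionTreeClassifier(random_state=0),
    GaussianNB(),
    KNeighborsClassifier(),
]
for model in base_models:
    model.fit(X_base, y_base)

# Meta-dataset: one tuple per held-out instance; the input attributes are
# the base classifiers' predicted classes, and the target attribute is the
# original class label.
meta_X = np.column_stack([m.predict(X_meta) for m in base_models])
meta_model = LogisticRegression(max_iter=1000).fit(meta_X, y_meta)

# A test instance is first classified by each base classifier; those
# predictions are then combined by the meta-classifier into a final one.
test_instance = X_meta[:1]
meta_features = np.column_stack([m.predict(test_instance) for m in base_models])
final = meta_model.predict(meta_features)
```

Because the base classifiers never see the held-out subset during training, the meta-dataset records how they behave on unseen data rather than on memorized training instances.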
Stacking performance can be improved by using output probabilities for
every class label from the base-level classifiers. In such cases, the number of
input attributes in the meta-dataset is multiplied by the number of classes.
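The multiplication of attributes can be seen in a short sketch (scikit-learn and the three-class iris data assumed; the two model choices are illustrative): each base classifier contributes one probability column per class, so the meta-dataset width becomes the number of classifiers times the number of classes.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)  # 3 classes
X_base, X_meta, y_base, y_meta = train_test_split(
    X, y, test_size=0.5, random_state=0)

models = [
    DecisionTreeClassifier(random_state=0).fit(X_base, y_base),
    GaussianNB().fit(X_base, y_base),
]

# predict_proba yields one column per class label; stacking the outputs of
# two classifiers over three classes gives 2 * 3 = 6 meta-level attributes.
meta_X = np.column_stack([m.predict_proba(X_meta) for m in models])
print(meta_X.shape)  # (75, 6): 75 held-out instances, 6 attributes
```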
It has been shown that with stacking the ensemble performs (at best) comparably to selecting the best classifier from the ensemble by cross-validation [Dzeroski and Zenko (2004)]. In order to improve the existing stacking approach, they employed a new multi-response model tree to learn