Then, the trained ensemble is employed to generate a new training set by
replacing the desired class labels of the original training examples with the
output from the trained ensemble. Some extra training examples are also
generated from the trained ensemble and added to the new training set.
Finally, a C4.5 decision tree is grown from the new training set. Since the
learned result is a decision tree, the comprehensibility of NeC4.5 is better
than that of a neural network ensemble.
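To make the procedure concrete, the following is a minimal sketch of the
NeC4.5 idea in Python with scikit-learn. The bagged-MLP ensemble, the
Gaussian perturbation used to generate the extra examples, and the use of
scikit-learn's CART-style DecisionTreeClassifier in place of a true C4.5
learner are all illustrative assumptions, not the original algorithm's exact
components:

```python
import numpy as np
from sklearn.ensemble import BaggingClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.tree import DecisionTreeClassifier

def nec45_sketch(X, y, n_extra=1000, seed=0):
    rng = np.random.default_rng(seed)

    # 1. Train a neural network ensemble (bagged MLPs; scikit-learn >= 1.2).
    ensemble = BaggingClassifier(
        estimator=MLPClassifier(hidden_layer_sizes=(10,), max_iter=500),
        n_estimators=10, random_state=seed,
    ).fit(X, y)

    # 2. Replace the original class labels with the ensemble's output.
    y_new = ensemble.predict(X)

    # 3. Generate extra examples -- here, Gaussian perturbations of random
    #    training points (an assumed generator) -- labeled by the ensemble.
    X_extra = X[rng.integers(len(X), size=n_extra)]
    X_extra = X_extra + rng.normal(scale=0.05, size=X_extra.shape)
    X_new = np.vstack([X, X_extra])
    y_new = np.concatenate([y_new, ensemble.predict(X_extra)])

    # 4. Grow a single, comprehensible decision tree from the new set.
    return DecisionTreeClassifier().fit(X_new, y_new)
```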
Using several inducers can resolve the dilemma that arises from the
“no free lunch” theorem. This theorem implies that a certain inducer
will be successful only insofar as its bias matches the characteristics of the
application domain [Brazdil et al. (1994)]. Thus, given a certain application,
the practitioner needs to decide which inducer to use. Using a
multi-inducer obviates the need to try each one and simplifies the entire
process.
9.5.6 Measuring the Diversity
As stated above, it is usually assumed that increasing diversity may decrease
ensemble error [Zenobi and Cunningham (2001)]. For regression problems,
variance is usually used to measure diversity [Krogh and Vedelsby (1995)].
In such cases it can be easily shown that the ensemble error can be reduced
by increasing ensemble diversity while maintaining the average error of a
single model.
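This is the ambiguity decomposition of Krogh and Vedelsby (1995). Writing
the ensemble output as the weighted average \(\bar{f}(x) = \sum_\alpha w_\alpha f_\alpha(x)\),
the squared-error generalization error of the ensemble satisfies

\[
E = \bar{E} - \bar{A},
\qquad
\bar{E} = \sum_\alpha w_\alpha E_\alpha,
\qquad
\bar{A} = \sum_\alpha w_\alpha\, \mathbb{E}_x\!\left[\bigl(f_\alpha(x) - \bar{f}(x)\bigr)^2\right],
\]

where \(E_\alpha\) is the generalization error of member \(\alpha\) and the
ambiguity \(\bar{A} \ge 0\) plays the role of the diversity term. Because
\(\bar{A}\) is non-negative, the ensemble error never exceeds the weighted
average member error, and increasing \(\bar{A}\) while holding \(\bar{E}\)
fixed reduces \(E\).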
In classification problems, a more complicated measure is required to
evaluate diversity. There have been several attempts to define a diversity
measure for classification tasks.
In the neural network literature, two measures are presented for
examining diversity; a short computational sketch follows their definitions:
Classification coverage: An instance is covered by a classifier if the
classifier assigns it the correct label.
Coincident errors: A coincident error amongst the classifiers occurs when
more than one member misclassifies a given instance.
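Both measures can be computed directly from a matrix of member predictions;
the helper below is an assumed illustration, not code from the text:

```python
import numpy as np

def coverage_and_coincident_errors(preds, y):
    """preds: (n_members, n_instances) predicted labels; y: true labels."""
    correct = preds == y                          # per-member coverage matrix
    coverage = correct.mean(axis=1)               # fraction of instances each
                                                  # classifier covers
    errors_per_instance = (~correct).sum(axis=0)  # members wrong per instance
    coincident_rate = (errors_per_instance > 1).mean()
    return coverage, coincident_rate
```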
Based on these two measures, Sharkey (1997) defined four diversity levels:
Level 1 — No coincident errors and the classification function is
completely covered by a majority vote of the members.
Level 2 — Coincident errors may occur, but the classification function is
completely covered by a majority vote.
Level 3 — A majority vote will not always yield the correct classification,
but every instance is classified correctly by at least one ensemble member.
Level 4 — The classification function is not always covered by the members
of the ensemble.
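As an illustration of how these levels relate to the two measures above, here
is a hypothetical sketch (assumed, not from the text) that determines the
level an ensemble attains on a labeled test set. It assumes non-negative
integer labels; majority-vote ties are broken toward the smallest label, an
arbitrary choice:

```python
import numpy as np

def sharkey_level(preds, y):
    """preds: (n_members, n_instances) integer labels; y: true labels."""
    correct = preds == y
    no_coincident = ((~correct).sum(axis=0) <= 1).all()
    # Column-wise majority vote (assumes non-negative integer labels).
    majority = np.apply_along_axis(lambda c: np.bincount(c).argmax(), 0, preds)
    majority_ok = (majority == y).all()
    covered = correct.any(axis=0).all()  # every instance right for some member
    if no_coincident and majority_ok:
        return 1
    if majority_ok:
        return 2
    if covered:
        return 3
    return 4
```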