have demonstrated the existence and correctness of the “conservation law” [Schaffer (1994)] or “no free lunch theorem” [Wolpert (1996)]: if one inducer is better than another in some domains, then there are necessarily other domains in which this relationship is reversed.
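Stated schematically (this is a sketch of the flavor of Wolpert's result rather than its exact formulation; C denotes off-training-set error, d a fixed training set, and f the target function), for any two inducers A and B,

\[
\sum_{f} E\big(C \mid f, d, A\big) \;=\; \sum_{f} E\big(C \mid f, d, B\big),
\]

i.e., once performance is uniformly averaged over all mathematically possible target functions, no inducer is distinguishable from any other.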
The “no free lunch theorem” implies that for a given problem, a
certain approach can yield more information from the same data than other
approaches.
A distinction should be made between all the mathematically possible domains, which are simply a product of the representation languages used, and the domains that occur in the real world, and are therefore the ones of primary interest [Rao et al. (1995)]. Without doubt there are many domains in the former set that are not in the latter, and average accuracy in the real-world domains can be increased at the expense of accuracy in the domains that never occur in practice. Indeed, achieving this is the goal of inductive learning research. It is still true that some algorithms will match certain classes of naturally occurring domains better than other algorithms, and so achieve higher accuracy than those algorithms. While this may be reversed in other real-world domains, it does not preclude an improved algorithm from being as accurate as the best in each of the domain classes.
Indeed, in many application domains, the generalization error of even
the best methods is far above 0%, and the question of whether it can be
improved, and if so how, is an open and important one. One aspect in
answering this question is determining the minimum error achievable by
any classifier in the application domain (known as the optimal Bayes error).
If existing classifiers do not reach this level, new approaches are needed.
Although this problem has received considerable attention (see for instance [Tumer and Ghosh (1996)]), no generally reliable method has so far been demonstrated.
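As a point of reference (a sketch in standard notation, not drawn from the sources cited above), for a feature vector x and class label y the optimal Bayes error can be written as

\[
\varepsilon_{\mathrm{Bayes}} \;=\; \mathbb{E}_{x}\Big[\, 1 - \max_{y} P(y \mid x) \,\Big],
\]

the expected probability mass of all classes other than the most probable one at each point. No classifier can achieve a lower error on that domain, which is why estimating this quantity indicates whether improved methods are possible at all.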
The “no free lunch” concept presents a dilemma to the analyst
approaching a new task: Which inducer should be used?
If the analyst is looking for accuracy only, one solution is to try each inducer in turn, estimate its generalization error, and choose the one that appears to perform best [Schaffer (1994)].
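A minimal sketch of this selection-by-estimation strategy, assuming a scikit-learn environment with k-fold cross-validation as the error estimator; the candidate inducers and the dataset are illustrative choices, not ones prescribed here:

    # Sketch: choose the inducer with the lowest estimated generalization error.
    # Assumes scikit-learn; the candidates and dataset are illustrative only.
    from sklearn.datasets import load_iris
    from sklearn.model_selection import cross_val_score
    from sklearn.naive_bayes import GaussianNB
    from sklearn.neighbors import KNeighborsClassifier
    from sklearn.tree import DecisionTreeClassifier

    X, y = load_iris(return_X_y=True)

    candidates = {
        "decision tree": DecisionTreeClassifier(random_state=0),
        "naive Bayes": GaussianNB(),
        "3-NN": KNeighborsClassifier(n_neighbors=3),
    }

    # Estimate each inducer's generalization error by 10-fold cross-validation.
    errors = {
        name: 1.0 - cross_val_score(clf, X, y, cv=10).mean()
        for name, clf in candidates.items()
    }

    best = min(errors, key=errors.get)
    for name, err in sorted(errors.items(), key=lambda kv: kv[1]):
        print(f"{name:15s} estimated error = {err:.3f}")
    print(f"selected inducer: {best}")

Note that the cross-validation score is itself only an estimate of the generalization error, so selecting among many candidates on this basis carries its own risk of overfitting the estimate.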
Another approach, known as multistrategy learning [Michalski and Tecuci (1994)], attempts to combine two or more different paradigms in a single algorithm. Most research in this area has been concerned with combining empirical approaches with analytical methods (see for instance [Towell and Shavlik (1994)]). Ideally, a multistrategy learning algorithm would always perform as well as the best