direction, the algorithm looks for the directions that optimize that index.
Since the Gaussian distribution is regarded as the least interesting, projection
indices usually measure some aspect of non-Gaussianity.
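As a toy illustration of such a projection index, the sketch below scores 1-D samples by the absolute value of their excess kurtosis, a common non-Gaussianity measure that is zero for a Gaussian. The data and the specific index are assumptions for the example, not taken from the text.

```python
# Hypothetical projection-pursuit index: score a 1-D projection by
# |excess kurtosis|; a Gaussian projection scores near 0, while a
# structured (e.g. bimodal) projection scores higher.
import random

def kurtosis_index(values):
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    return abs(m4 / var ** 2 - 3.0)  # excess kurtosis; Gaussian ~ 0

rng = random.Random(0)
# "Uninteresting" projection: pure Gaussian noise.
gaussian = [rng.gauss(0, 1) for _ in range(5000)]
# "Interesting" projection: a bimodal mixture (two well-separated clusters).
bimodal = [rng.gauss(-2, 0.5) if rng.random() < 0.5 else rng.gauss(2, 0.5)
           for _ in range(5000)]
```

A projection-pursuit search would rotate the projection direction to maximize such an index; here the bimodal sample receives the clearly higher score.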
13.3.3 Wrappers
The wrapper strategy for feature selection uses an induction algorithm
to evaluate feature subsets. The motivation for this strategy is that the
induction method that will eventually use the feature subset should provide
a better predictor of accuracy than a separate measure that has an entirely
different inductive bias [Langley (1994)].
Feature wrappers often outperform filters because they are tuned
to the specific interaction between an induction algorithm and its training
data. Nevertheless, they tend to be much slower than feature filters because
they must repeatedly run the induction algorithm.
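The wrapper loop described above can be sketched as greedy forward selection: each candidate feature subset is scored by actually running a learner on it. This is a minimal illustration, not the authors' system; the 1-nearest-neighbour learner, the leave-one-out scoring, and the toy data are all assumptions chosen to keep the example self-contained.

```python
# Wrapper-style forward feature selection (toy sketch).
# The "induction algorithm" is a 1-NN classifier; each candidate subset
# is evaluated by re-running it under leave-one-out cross-validation,
# which is exactly why wrappers are slow.
import random

def knn_predict(train, labels, x, feats):
    # 1-NN using only the selected feature indices.
    best, best_d = None, float("inf")
    for row, y in zip(train, labels):
        d = sum((row[f] - x[f]) ** 2 for f in feats)
        if d < best_d:
            best_d, best = d, y
    return best

def loo_accuracy(X, y, feats):
    # Leave-one-out: the learner is re-run once per training example.
    hits = 0
    for i in range(len(X)):
        train, labels = X[:i] + X[i + 1:], y[:i] + y[i + 1:]
        hits += knn_predict(train, labels, X[i], feats) == y[i]
    return hits / len(X)

def forward_select(X, y, n_features):
    selected, remaining, best_score = [], list(range(n_features)), 0.0
    while remaining:
        score, f = max((loo_accuracy(X, y, selected + [f]), f)
                       for f in remaining)
        if score <= best_score:      # stop when no feature improves accuracy
            break
        best_score = score
        selected.append(f)
        remaining.remove(f)
    return selected, best_score

# Toy data: feature 0 determines the class; features 1-2 are pure noise.
random.seed(0)
X = [[i % 2 + random.gauss(0, 0.1), random.random(), random.random()]
     for i in range(40)]
y = [i % 2 for i in range(40)]
selected, score = forward_select(X, y, 3)
```

The wrapper keeps only feature 0, since the noise features never raise the learner's cross-validated accuracy.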
13.3.3.1 Wrappers for Decision Tree Learners
The general wrapper framework for feature selection has two degrees of
feature relevance, which the wrapper uses to discover
relevant features [John et al. (1994)]. A feature X_i is said to be strongly
relevant to the target concept(s) if the probability distribution of the
class values, given the full feature set, changes when X_i is eliminated.
A feature X_i is said to be weakly relevant if it is not strongly relevant
and the probability distribution of the class values, given some subset that
contains X_i, changes when X_i is removed. All features that are not strongly
or weakly relevant are irrelevant.
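These definitions can be checked directly on a small discrete dataset by comparing conditional class distributions with and without a feature. The boolean dataset below is a constructed example (not from the text): the class equals feature A, feature B is an exact copy of A, and feature C is noise, so A is weakly but not strongly relevant.

```python
# Checking strong vs. weak relevance on a toy boolean dataset:
# y = A, B is an exact copy of A, C is irrelevant.
from collections import defaultdict

data = [((a, a, c), a) for a in (0, 1) for c in (0, 1)]  # (A, B, C) -> y

def cond_dist(feats):
    """P(y | projection of x onto the given feature indices)."""
    table = defaultdict(lambda: defaultdict(int))
    for x, y in data:
        key = tuple(x[i] for i in feats)
        table[key][y] += 1
    return {k: {y: n / sum(v.values()) for y, n in v.items()}
            for k, v in table.items()}

def changes_when_removed(i, feats):
    """True if dropping feature i from `feats` alters P(y | features)."""
    kept = [f for f in feats if f != i]
    full, reduced = cond_dist(feats), cond_dist(kept)
    for key, dist in full.items():
        rkey = tuple(k for f, k in zip(feats, key) if f != i)
        if dist != reduced[rkey]:
            return True
    return False

# A (index 0) is NOT strongly relevant: its copy B carries the same info.
strong_A = changes_when_removed(0, [0, 1, 2])
# But A IS weakly relevant: within the subset {A, C}, removing A matters.
weak_A = changes_when_removed(0, [0, 2])
# C (index 2) is irrelevant: removing it never changes the distribution.
strong_C = changes_when_removed(2, [0, 1, 2])
```

The example also shows why weak relevance is defined via subsets: given the full feature set, neither redundant copy appears relevant on its own.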
Vafaie and De Jong (1995) and Cherkauer and Shavlik (1996) have both
applied genetic search strategies in a wrapper framework in order to improve
the performance of decision tree learners. Vafaie and De Jong (1995) present
a system with two genetic-algorithm-driven modules. The first performs
feature selection, while the second performs constructive induction,
which is the process of creating new attributes by applying logical and
mathematical operators to the original features.
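In that spirit, a genetic search over feature subsets can be sketched with bitmask individuals, crossover, and mutation. This is a toy reconstruction, not the cited system: the fitness function below is a stand-in that simply rewards keeping two (assumed) informative features and penalizes noise features, where a real wrapper would score each mask by running the decision tree learner.

```python
# Toy genetic-algorithm feature selection: individuals are feature
# bitmasks; selection, one-point crossover, and point mutation evolve
# the population toward high-fitness subsets.
import random

N_FEATURES = 6
INFORMATIVE = {0, 3}  # assumed ground truth for this toy fitness function

def fitness(mask):
    # +1 per informative feature kept, -0.25 per noise feature kept.
    kept = {i for i, b in enumerate(mask) if b}
    return len(kept & INFORMATIVE) - 0.25 * len(kept - INFORMATIVE)

def evolve(pop_size=20, generations=30, seed=1):
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(N_FEATURES)]
           for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        survivors = pop[:pop_size // 2]          # truncation selection
        children = []
        while len(survivors) + len(children) < pop_size:
            a, b = rng.sample(survivors, 2)
            cut = rng.randrange(1, N_FEATURES)   # one-point crossover
            child = a[:cut] + b[cut:]
            child[rng.randrange(N_FEATURES)] ^= 1  # point mutation
            children.append(child)
        pop = survivors + children
    return max(pop, key=fitness)

best = evolve()
```

The constructive-induction module of the cited system would go further, evolving new attributes built from operators over the originals rather than just subset masks.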
13.4 Feature Selection as a Means of Creating Ensembles
The main idea of ensemble methodology is to combine a set of models,
each of which solves the same original task, in order to obtain a better
composite global model, with more accurate and reliable estimates or