direction, the algorithm looks for the directions that optimize that index.
Since the Gaussian distribution is regarded as the least interesting, projection
indices usually measure some aspect of non-Gaussianity.
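As a toy illustration of such a projection index, the sketch below scores 1-D samples by the absolute value of their excess kurtosis, a common non-Gaussianity measure that is zero for a Gaussian. The data and the specific index are assumptions for the example, not taken from the text.

```python
# Hypothetical projection-pursuit index: score a 1-D projection by
# |excess kurtosis|; a Gaussian projection scores near 0, while a
# structured (e.g. bimodal) projection scores higher.
import random

def kurtosis_index(values):
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    return abs(m4 / var ** 2 - 3.0)  # excess kurtosis; Gaussian ~ 0

rng = random.Random(0)
# "Uninteresting" projection: pure Gaussian noise.
gaussian = [rng.gauss(0, 1) for _ in range(5000)]
# "Interesting" projection: a bimodal mixture (two well-separated clusters).
bimodal = [rng.gauss(-2, 0.5) if rng.random() < 0.5 else rng.gauss(2, 0.5)
           for _ in range(5000)]
```

A projection-pursuit search would rotate the projection direction to maximize such an index; here the bimodal sample receives the clearly higher score.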
13.3.3 Wrappers
The wrapper strategy for feature selection uses an induction algorithm
to evaluate feature subsets. The motivation for this strategy is that the
induction method that will eventually use the feature subset should provide
a better predictor of accuracy than a separate measure that has an entirely
different inductive bias [Langley (1994)].
Feature wrappers often outperform filters because they are tuned
to the specific interaction between an induction algorithm and its training
data. Nevertheless, they tend to be much slower than feature filters because
they must repeatedly run the induction algorithm.
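The wrapper loop described above can be sketched as greedy forward selection: each candidate feature subset is scored by actually running a learner on it. This is a minimal illustration, not the authors' system; the 1-nearest-neighbour learner, the leave-one-out scoring, and the toy data are all assumptions chosen to keep the example self-contained.

```python
# Wrapper-style forward feature selection (toy sketch).
# The "induction algorithm" is a 1-NN classifier; each candidate subset
# is evaluated by re-running it under leave-one-out cross-validation,
# which is exactly why wrappers are slow.
import random

def knn_predict(train, labels, x, feats):
    # 1-NN using only the selected feature indices.
    best, best_d = None, float("inf")
    for row, y in zip(train, labels):
        d = sum((row[f] - x[f]) ** 2 for f in feats)
        if d < best_d:
            best_d, best = d, y
    return best

def loo_accuracy(X, y, feats):
    # Leave-one-out: the learner is re-run once per training example.
    hits = 0
    for i in range(len(X)):
        train, labels = X[:i] + X[i + 1:], y[:i] + y[i + 1:]
        hits += knn_predict(train, labels, X[i], feats) == y[i]
    return hits / len(X)

def forward_select(X, y, n_features):
    selected, remaining, best_score = [], list(range(n_features)), 0.0
    while remaining:
        score, f = max((loo_accuracy(X, y, selected + [f]), f)
                       for f in remaining)
        if score <= best_score:      # stop when no feature improves accuracy
            break
        best_score = score
        selected.append(f)
        remaining.remove(f)
    return selected, best_score

# Toy data: feature 0 determines the class; features 1-2 are pure noise.
random.seed(0)
X = [[i % 2 + random.gauss(0, 0.1), random.random(), random.random()]
     for i in range(40)]
y = [i % 2 for i in range(40)]
selected, score = forward_select(X, y, 3)
```

The wrapper keeps only feature 0, since the noise features never raise the learner's cross-validated accuracy.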
13.3.3.1 Wrappers for Decision Tree Learners
The general wrapper framework for feature selection has two degrees of
feature relevance, which the wrapper uses to discover
relevant features [John et al. (1994)]. A feature X_i is said to be strongly
relevant to the target concept(s) if the probability distribution of the
class values, given the full feature set, changes when X_i is eliminated.
A feature X_i is said to be weakly relevant if it is not strongly relevant
and the probability distribution of the class values, given some subset that
contains X_i, changes when X_i is removed. All features that are not strongly
or weakly relevant are irrelevant.
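These definitions can be checked directly on a small discrete dataset by comparing conditional class distributions with and without a feature. The boolean dataset below is a constructed example (not from the text): the class equals feature A, feature B is an exact copy of A, and feature C is noise, so A is weakly but not strongly relevant.

```python
# Checking strong vs. weak relevance on a toy boolean dataset:
# y = A, B is an exact copy of A, C is irrelevant.
from collections import defaultdict

data = [((a, a, c), a) for a in (0, 1) for c in (0, 1)]  # (A, B, C) -> y

def cond_dist(feats):
    """P(y | projection of x onto the given feature indices)."""
    table = defaultdict(lambda: defaultdict(int))
    for x, y in data:
        key = tuple(x[i] for i in feats)
        table[key][y] += 1
    return {k: {y: n / sum(v.values()) for y, n in v.items()}
            for k, v in table.items()}

def changes_when_removed(i, feats):
    """True if dropping feature i from `feats` alters P(y | features)."""
    kept = [f for f in feats if f != i]
    full, reduced = cond_dist(feats), cond_dist(kept)
    for key, dist in full.items():
        rkey = tuple(k for f, k in zip(feats, key) if f != i)
        if dist != reduced[rkey]:
            return True
    return False

# A (index 0) is NOT strongly relevant: its copy B carries the same info.
strong_A = changes_when_removed(0, [0, 1, 2])
# But A IS weakly relevant: within the subset {A, C}, removing A matters.
weak_A = changes_when_removed(0, [0, 2])
# C (index 2) is irrelevant: removing it never changes the distribution.
strong_C = changes_when_removed(2, [0, 1, 2])
```

The example also shows why weak relevance is defined via subsets: given the full feature set, neither redundant copy appears relevant on its own.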
Vafaie and De Jong (1995) and Cherkauer and Shavlik (1996) have both
applied genetic search strategies in a wrapper framework in order to improve
the performance of decision tree learners. Vafaie and De Jong (1995) present
a system with two genetic-algorithm-driven modules. The first performs
feature selection, while the second performs constructive induction,
which is the process of creating new attributes by applying logical and
mathematical operators to the original features.
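In that spirit, a genetic search over feature subsets can be sketched with bitmask individuals, crossover, and mutation. This is a toy reconstruction, not the cited system: the fitness function below is a stand-in that simply rewards keeping two (assumed) informative features and penalizes noise features, where a real wrapper would score each mask by running the decision tree learner.

```python
# Toy genetic-algorithm feature selection: individuals are feature
# bitmasks; selection, one-point crossover, and point mutation evolve
# the population toward high-fitness subsets.
import random

N_FEATURES = 6
INFORMATIVE = {0, 3}  # assumed ground truth for this toy fitness function

def fitness(mask):
    # +1 per informative feature kept, -0.25 per noise feature kept.
    kept = {i for i, b in enumerate(mask) if b}
    return len(kept & INFORMATIVE) - 0.25 * len(kept - INFORMATIVE)

def evolve(pop_size=20, generations=30, seed=1):
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(N_FEATURES)]
           for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        survivors = pop[:pop_size // 2]          # truncation selection
        children = []
        while len(survivors) + len(children) < pop_size:
            a, b = rng.sample(survivors, 2)
            cut = rng.randrange(1, N_FEATURES)   # one-point crossover
            child = a[:cut] + b[cut:]
            child[rng.randrange(N_FEATURES)] ^= 1  # point mutation
            children.append(child)
        pop = survivors + children
    return max(pop, key=fitness)

best = evolve()
```

The constructive-induction module of the cited system would go further, evolving new attributes built from operators over the originals rather than just subset masks.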
13.4 Feature Selection as a Means of Creating Ensembles
The main idea of ensemble methodology is to combine a set of models,
each of which solves the same original task, in order to obtain a better
composite global model, with more accurate and reliable estimates or