Another disadvantage of this approach lies in the computational costs it requires. Executing the learning algorithm for many subsets of features can become infeasible, not only when there are very many attributes to consider, but also when the training step is complex and time-consuming even for smaller numbers of variables. For example, artificial neural networks cope much better with more inputs than necessary than with too few. The scale of the problem is illustrated by the sketch below.
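To make the cost concrete, the following minimal sketch performs an exhaustive wrapper search, assuming scikit-learn is available and taking a decision tree as the inducer; the function name and the data variables X and y are illustrative placeholders, not part of the original text. Since every non-empty subset of n features is evaluated, the inner body runs 2**n - 1 times, each time paying the full price of training and validating the learner.

    # A minimal sketch of exhaustive wrapper selection (hypothetical
    # helper, assuming scikit-learn); infeasible for large n, since
    # the loop visits all 2**n - 1 non-empty feature subsets.
    from itertools import combinations

    import numpy as np
    from sklearn.model_selection import cross_val_score
    from sklearn.tree import DecisionTreeClassifier

    def exhaustive_wrapper(X, y, n_features):
        best_score, best_subset = -np.inf, None
        for k in range(1, n_features + 1):
            for subset in combinations(range(n_features), k):
                # the inducer itself judges each candidate subset
                score = cross_val_score(
                    DecisionTreeClassifier(), X[:, list(subset)], y, cv=5
                ).mean()
                if score > best_score:
                    best_score, best_subset = score, subset
        return best_subset, best_score

In practice heuristic strategies such as forward selection or backward elimination replace the exhaustive loop, but every evaluated candidate still requires a full training run of the inducer.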
When an inducer is able by itself to select some features while disregarding others, owing to additional procedures dedicated to dimensionality reduction, it cannot be used in wrapper mode, as the wrapper has no role left to play. If such processing is employed, the result is an embedded approach.
The wrapper model can be used not only for feature selection or reduction, but also for other purposes, such as better adjusting some parameters of a classification system. An example of such a procedure is establishing preference orders for values of conditional attributes in the Dominance-based Rough Set Approach, when there is insufficient domain knowledge for such definitions [39].
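As a generic illustration of this second use of the wrapper model (not the DRSA procedure of [39] itself, which requires a dominance-based learner), the hedged sketch below tunes a single parameter of a classifier, its tree depth, by the same train-and-evaluate loop used for subset search; all names are again placeholders.

    # A hedged sketch of wrapper-style parameter adjustment: candidate
    # parameter values are judged solely by the quality the learner
    # itself achieves with them, exactly as feature subsets are above.
    from sklearn.model_selection import cross_val_score
    from sklearn.tree import DecisionTreeClassifier

    def wrap_parameter(X, y, depths=range(1, 11)):
        scored = [
            (cross_val_score(
                DecisionTreeClassifier(max_depth=d), X, y, cv=5
             ).mean(), d)
            for d in depths
        ]
        best_score, best_depth = max(scored)
        return best_depth, best_score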
3.3.3 Embedded Solutions
Several predictors have their own inherent mechanisms, built into the learning algorithm, dedicated to feature selection. When such a mechanism is actively used, we have an embedded solution [23].
An example of this category is input pruning for artificial neural networks [19, 21], which establishes through repetitive computations which network inputs have very little influence on the network outputs. In decision trees some feature is in fact selected at each node, and this decision is a constituent element of the algorithm that cannot simply be separated from it. For rough sets such a function is played by relative reducts, subsets of attributes which guarantee the same quality of approximation as the entire set of variables [26].
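The decision-tree case is the easiest to show in code. In the minimal sketch below, again assuming scikit-learn and with the helper name chosen only for illustration, the tree picks a splitting attribute at every node on its own; features never chosen receive zero importance and can be dropped without any outer search loop, which is precisely what makes the selection embedded.

    # A small sketch of embedded selection via a decision tree
    # (assuming scikit-learn); no wrapper loop is needed, the
    # learning algorithm selects features as it builds the tree.
    import numpy as np
    from sklearn.tree import DecisionTreeClassifier

    def embedded_tree_selection(X, y):
        tree = DecisionTreeClassifier(random_state=0).fit(X, y)
        # features the tree actually split on during learning
        used = np.flatnonzero(tree.feature_importances_ > 0)
        return used, tree.feature_importances_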
When the set of all relative reducts is treated as yet another entity, another form of expressing the available knowledge on features extracted from instances, we can assign weights [25] and define quality measures for them [40, 42], to be used in feature selection. These measures and weights can take into account how often each attribute is included in reducts of specific cardinalities, and the same statistics for other variables included in the same reducts. This kind of processing leads to an ordering of features, which can be interpreted as their ranking, as the sketch below illustrates.
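A minimal sketch of such reduct-based scoring follows. The reducts themselves are assumed to be already computed (finding them is a separate rough-set task), and the 1/|reduct| weight, which lets attributes from short reducts count more, is only one plausible choice; the measures of [25, 40, 42] are defined differently in detail.

    # A hedged sketch of ranking attributes by their occurrence in
    # relative reducts; the cardinality-based weight is illustrative.
    from collections import defaultdict

    def rank_by_reducts(reducts):
        """reducts: iterable of sets of attribute names."""
        score = defaultdict(float)
        for reduct in reducts:
            for attr in reduct:
                score[attr] += 1.0 / len(reduct)  # short reducts weigh more
        # higher score first -> an ordering interpretable as a ranking
        return sorted(score.items(), key=lambda kv: -kv[1])

    # Example: three reducts over attributes a..d
    print(rank_by_reducts([{"a", "b"}, {"a", "c"}, {"b", "c", "d"}]))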
3.3.4 Ranking of Features
When we proceed through the entire set of available features with an application of any of the aforementioned approaches to feature selection, either singly or in combination, these features become ordered by the value of some score or