Depending on the organisation of the search process, feature selection algorithms are
typically categorised as filter, wrapper, or embedded approaches. There are also
combinations of these approaches, where for example a filter is employed first and
then a wrapper, or a wrapper is used as a filter. It is also possible to apply some
algorithm to obtain a ranking of attributes, based on which feature selection or
reduction is then executed.
3.3.1 Filters
Filters are processes completely separate from the systems used for classification,
working independently of their performance and other parameters. They can be treated
as a kind of pre-processing procedure. They exploit information contained in the input
data sets, looking for example at information gain, entropy, or consistency [9].
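As an illustration of such a filter measure, the sketch below scores a single discrete attribute by its information gain, i.e. the reduction in class entropy obtained by splitting the data on that attribute. The function names and data layout are this sketch's own assumptions, not taken from the cited works.

```python
import numpy as np
from collections import Counter

def entropy(labels):
    # Shannon entropy of a sequence of class labels, in bits
    counts = np.array(list(Counter(labels).values()), dtype=float)
    p = counts / counts.sum()
    return float(-(p * np.log2(p)).sum())

def information_gain(feature, labels):
    # Entropy of the classes minus the entropy remaining after
    # partitioning the examples by the value of one discrete feature
    total = entropy(labels)
    n = len(labels)
    remainder = 0.0
    for v in set(feature):
        subset = [l for f, l in zip(feature, labels) if f == v]
        remainder += len(subset) / n * entropy(subset)
    return total - remainder
```

An attribute that perfectly separates the classes scores the full class entropy, while a constant attribute scores zero; ranking attributes by this score requires no classifier at all, which is precisely what makes it a filter.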
One of the popular algorithms from this group is Relief, in its original form invented
for binary classification (later modified to allow for multiple classes) [43]. Relief
assigns scores to variables depending on how well they discern decision classes. It
randomly samples the training set, looking for the two examples that are nearest to
the one selected: one from the same class (near-hit) and the other from the opposite
class (near-miss), and based on these distances iteratively accumulates weights for
attributes. One of the drawbacks of the Relief algorithm is that it looks for all
relevant features and cannot discern redundant ones; even variables that are relevant
to only a very low degree still have some weight assigned.
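The sampling scheme described above can be sketched as follows for the binary case. This is a minimal illustrative version, assuming features scaled to [0, 1] and Manhattan distance; the function name and parameters are this sketch's own choices.

```python
import numpy as np

def relief(X, y, n_iter=100, seed=0):
    # Minimal binary Relief sketch (assumption: X is a (samples, features)
    # array scaled to [0, 1]; y holds exactly two class labels).
    rng = np.random.default_rng(seed)
    n, m = X.shape
    w = np.zeros(m)
    for _ in range(n_iter):
        i = rng.integers(n)                      # randomly sampled example
        dists = np.abs(X - X[i]).sum(axis=1)     # Manhattan distances
        dists[i] = np.inf                        # exclude the example itself
        same = (y == y[i])
        hit = np.argmin(np.where(same, dists, np.inf))    # near-hit
        miss = np.argmin(np.where(~same, dists, np.inf))  # near-miss
        # a feature gains weight when it separates the example from the
        # near-miss, and loses weight when it differs from the near-hit
        w += np.abs(X[i] - X[miss]) - np.abs(X[i] - X[hit])
    return w / n_iter
```

On data where one attribute separates the classes and another is noise, the discriminating attribute accumulates a clearly higher weight, yet, as noted above, the noise attribute still ends up with some weight assigned.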
The general nature of filters makes them applicable in all cases, yet the fact that
they totally disregard the performance of the classification system employing the set
of selected variables typically leads to worse results than other approaches, which
is considered a disadvantage.
3.3.2 Wrappers
In the wrapper approach to feature selection it is argued that the best evaluation of a
candidate variable subset is obtained by checking its usefulness in classification, as
the estimated predictive accuracy is typically considered the most important indicator
of attribute relevance [22]. The induction algorithm can be run over the entire
training set and then measured against the testing set, or a cross-validation method
can be employed.
Since the search and selection process is adjusted to the specific characteristics of
the inducer, wrappers can show a bias, resulting in increased performance for the
chosen classifier but worse results for another, especially when the two differ
significantly in their properties. In other words, wrappers tend to construct sets of
attributes that are customised, tailored to some particular task and some particular
system.
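A common wrapper instantiation is greedy forward selection, where the candidate subset grows one attribute at a time and each candidate is scored by the accuracy of the chosen inducer on held-out data. The sketch below uses a 1-nearest-neighbour classifier as the inducer purely for self-containedness; the function names and the hold-out evaluation scheme are this sketch's own assumptions.

```python
import numpy as np

def holdout_accuracy(X_tr, y_tr, X_te, y_te, feats):
    # 1-nearest-neighbour inducer restricted to the candidate features,
    # scored on a held-out test set (standing in for cross-validation)
    pred = []
    for x in X_te[:, feats]:
        j = np.argmin(np.abs(X_tr[:, feats] - x).sum(axis=1))
        pred.append(y_tr[j])
    return float(np.mean(np.array(pred) == y_te))

def forward_selection(X_tr, y_tr, X_te, y_te):
    # greedily add the single feature that most improves accuracy;
    # stop when no addition helps
    remaining = list(range(X_tr.shape[1]))
    chosen, best = [], 0.0
    while remaining:
        score, f = max((holdout_accuracy(X_tr, y_tr, X_te, y_te,
                                         chosen + [f]), f)
                       for f in remaining)
        if score <= best:
            break
        best = score
        chosen.append(f)
        remaining.remove(f)
    return chosen, best
```

Because the score is the accuracy of one particular inducer, the subset returned is tailored to that inducer, which is exactly the bias discussed above.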