9.5.4.2.3
Collective-Performance-based Strategy
Cunningham and Carney (2000) introduced an ensemble feature selection
strategy that randomly constructs the initial ensemble. Then, an iterative
refinement is performed based on a hill-climbing search in order to improve
the accuracy and diversity of the base classifiers. For all the feature subsets,
an attempt is made to switch (include or delete) each feature. If the resulting
feature subset produces a better performance on the validation set, that
change is kept. This process is continued until no further improvements are
obtained. Similarly, Zenobi and Cunningham (2001) suggest that the search
for the different feature subsets should be guided not only by the associated
error but also by the disagreement, or ambiguity, among the ensemble
members.
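The hill-climbing refinement described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the `evaluate` function stands in for training a base classifier on the subset and scoring it on the validation set, and the toy scoring rule at the end is purely hypothetical.

```python
import random

def hill_climb_subset(n_features, evaluate, seed=0):
    """Iteratively switch (include or delete) each feature; keep a
    change only if validation performance improves, and stop when a
    full pass yields no improvement (after Cunningham and Carney, 2000)."""
    rng = random.Random(seed)
    # randomly constructed initial subset (kept non-empty)
    subset = {f for f in range(n_features) if rng.random() < 0.5} or {0}
    best = evaluate(subset)
    improved = True
    while improved:
        improved = False
        for f in range(n_features):
            candidate = subset ^ {f}   # toggle feature f in or out
            if not candidate:          # require at least one feature
                continue
            score = evaluate(candidate)
            if score > best:           # keep the beneficial switch
                subset, best = candidate, score
                improved = True
    return subset, best

# toy validation score: features {1, 3} are informative (hypothetical)
score = lambda s: len(s & {1, 3}) - 0.1 * len(s - {1, 3})
subset, best = hill_climb_subset(5, score)
print(subset)   # converges to {1, 3}
```

Because each accepted switch strictly improves the validation score, the loop is guaranteed to terminate, though only at a local optimum.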
Tsymbal et al. (2004) compare several feature selection methods that
incorporate diversity as a component of the fitness function in the search for
the best collection of feature subsets. This study shows that there are some
datasets in which the ensemble feature selection method can be sensitive
to the choice of the diversity measure. Moreover, no particular measure is
superior in all cases.
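One common way to fold diversity into such a fitness function is mean pairwise disagreement over the members' predictions. The sketch below is illustrative only: the disagreement measure is one of several compared by Tsymbal et al. (2004), and the weighting scheme `alpha` is an assumption, not taken from that study.

```python
def disagreement(preds_a, preds_b):
    """Fraction of validation instances on which two members disagree."""
    return sum(a != b for a, b in zip(preds_a, preds_b)) / len(preds_a)

def ensemble_diversity(all_preds):
    """Mean pairwise disagreement across all member pairs."""
    n = len(all_preds)
    pairs = [(i, j) for i in range(n) for j in range(i + 1, n)]
    return sum(disagreement(all_preds[i], all_preds[j])
               for i, j in pairs) / len(pairs)

def fitness(accuracy, diversity, alpha=0.5):
    """Hypothetical fitness: accuracy plus a diversity bonus; alpha
    controls the accuracy/diversity trade-off."""
    return accuracy + alpha * diversity

# three members' predictions on four validation instances (toy data)
preds = [[0, 1, 1, 0],
         [0, 1, 0, 0],
         [1, 1, 1, 0]]
print(ensemble_diversity(preds))   # (0.25 + 0.25 + 0.5) / 3 = 1/3
```

Swapping `disagreement` for another measure (e.g. a correlation-based one) changes only the first function, which is what makes the sensitivity to the choice of measure easy to probe empirically.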
Gunter and Bunke (2004) suggest employing a feature subset search
algorithm in order to find different subsets of the given features. The feature
subset search algorithm not only takes the performance of the ensemble into
account, but also directly encourages diversity among the feature subsets.
Combining genetic search with ensemble feature selection was also
examined in the literature. Opitz and Shavlik (1996) applied GAs to
ensembles using genetic operators that were designed explicitly for hidden
nodes in knowledge-based neural networks. In later research, Opitz (1999)
used genetic search for ensemble feature selection. This genetic ensemble
feature selection (GEFS) strategy begins by creating an initial population of
classifiers, where each classifier is generated by randomly selecting a different
subset of features. Then, new candidate classifiers are continually produced
by applying the genetic operators of crossover and mutation to the feature
subsets. The final ensemble is composed of the most fit classifiers.
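The GEFS loop can be sketched at the level of feature subsets. This is a hedged illustration, not Opitz's implementation: the `evaluate` function stands in for training a member on the subset and measuring its validation fitness, and the population size, operator rates, and uniform-crossover choice are all assumptions.

```python
import random

def gefs(n_features, evaluate, pop_size=10, generations=20,
         ensemble_size=5, seed=1):
    """Sketch of genetic ensemble feature selection (after Opitz, 1999):
    evolve a population of feature subsets via crossover and mutation,
    then keep the most fit subsets as the final ensemble."""
    rng = random.Random(seed)

    def random_subset():
        s = frozenset(f for f in range(n_features) if rng.random() < 0.5)
        return s or frozenset([rng.randrange(n_features)])

    pop = [random_subset() for _ in range(pop_size)]
    for _ in range(generations):
        # fitness-based selection of parents
        parents = sorted(pop, key=evaluate, reverse=True)[:pop_size // 2]
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            # uniform crossover: each feature inherited from one parent
            child = {f for f in range(n_features)
                     if f in (a if rng.random() < 0.5 else b)}
            # mutation: flip each feature's membership with low probability
            child = {f for f in range(n_features)
                     if (f in child) != (rng.random() < 0.05)}
            children.append(frozenset(child)
                            or frozenset([rng.randrange(n_features)]))
        pop = parents + children
    # final ensemble: the feature subsets of the most fit classifiers
    return sorted(pop, key=evaluate, reverse=True)[:ensemble_size]

# toy fitness rewarding informative features {1, 3} (hypothetical)
fit = lambda s: len(s & {1, 3}) - 0.1 * len(s - {1, 3})
ensemble = gefs(6, fit)
```

Keeping the parents alongside the children (elitism) ensures the best subsets found so far survive each generation, though the search remains stochastic.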
9.5.4.2.4
Feature Set Partitioning
Feature set partitioning is a particular case of feature subset-based
ensembles in which the subsets are pairwise disjoint. At the same time,
feature set partitioning generalizes the task of feature selection, which aims
to provide a single representative set of features from which a classifier is