Table 2.3. Feature selection protocol for Iris data.

Selection strategy:     SBS
Assessment measure:     Separability qs

Feature subset    qs
1234              0.90667
-234              0.94667
--34              0.96000
---4              0.00000

Optimum quality:        0.96
Significant features:   3 4

caution, that the higher-order methods are usually superior. It is possible,
however, that the simple first-order scheme arrives at a configuration that
the higher-order methods skip because of their search strategy, and thus
returns a better solution. Instead of the strict top-down or bottom-up
processing found in SBS or SFS, other approaches, e.g., branch-and-bound
approaches or floating search, alternate between feature rejection and
selection during the search process.
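
The strict top-down processing of SBS, as recorded in the protocol of
Table 2.3, can be sketched as follows. This is a minimal sketch under stated
assumptions, not the implementation used for the table: the separability
measure qs is not defined in this passage, so a leave-one-out
nearest-neighbor accuracy stands in for it, and the names loo_1nn_accuracy
and sbs are hypothetical. Features are indexed from 0 here, whereas the
table counts from 1.

```python
from typing import Callable, List, Sequence, Tuple

Subset = Tuple[int, ...]


def loo_1nn_accuracy(X: Sequence[Sequence[float]], y: Sequence[int],
                     features: Sequence[int]) -> float:
    """Stand-in quality measure (assumption, not the qs of Table 2.3):
    leave-one-out 1-nearest-neighbor accuracy on the given features."""
    if not features:
        return 0.0
    hits = 0
    for i in range(len(X)):
        best_d, best_j = None, None
        for j in range(len(X)):
            if i == j:
                continue
            # Squared Euclidean distance restricted to the selected features.
            d = sum((X[i][f] - X[j][f]) ** 2 for f in features)
            if best_d is None or d < best_d:
                best_d, best_j = d, j
        if y[best_j] == y[i]:
            hits += 1
    return hits / len(X)


def sbs(X: Sequence[Sequence[float]], y: Sequence[int],
        quality: Callable = loo_1nn_accuracy
        ) -> Tuple[List[Tuple[Subset, float]], Subset, float]:
    """Sequential backward selection: start with all features and, in each
    step, drop the single feature whose removal gives the best quality."""
    current = list(range(len(X[0])))
    protocol = [(tuple(current), quality(X, y, current))]
    while len(current) > 1:
        # Evaluate every subset obtained by removing exactly one feature.
        candidates = [[f for f in current if f != r] for r in current]
        scores = [quality(X, y, c) for c in candidates]
        best = max(range(len(candidates)), key=lambda k: scores[k])
        current = candidates[best]
        protocol.append((tuple(current), scores[best]))
    # Report the best subset visited; on ties, prefer the smaller subset.
    best_subset, best_q = max(protocol, key=lambda p: (p[1], -len(p[0])))
    return protocol, best_subset, best_q


# Toy data (hypothetical): feature 0 is noise, feature 1 separates the classes.
X_toy = [[5.0, 0.0], [1.0, 0.1], [9.0, 0.2], [3.0, 0.3],
         [7.0, 1.0], [2.0, 1.1], [8.0, 1.2], [4.0, 1.3]]
y_toy = [0, 0, 0, 0, 1, 1, 1, 1]

if __name__ == "__main__":
    steps, subset, q = sbs(X_toy, y_toy)
    for s, value in steps:
        print(s, round(value, 5))
    print("selected:", subset, "quality:", q)
```

The returned protocol has the same shape as Table 2.3: one line per subset
visited, followed by the best quality and the corresponding features. Because
each step only removes features, a subset that was discarded early is never
revisited, which is the limitation that floating search addresses.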
Further heuristic search strategies can be applied to FS [2.45], [2.11].
These include stochastic methods, e.g., simulated annealing (SA) [2.1] or
Boltzmann machines (BM) [2.1], as well as bio-inspired optimization
techniques, e.g., genetic algorithms (GA) in particular and evolutionary
strategies (ES) in general. Multiobjective optimization can also be merged
with the GA/ES approach [2.11]. This subject is pursued in ongoing work.
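
To illustrate how a stochastic search differs from SBS or SFS, the following
is a minimal sketch of simulated annealing over binary feature masks. It is
an assumption-laden illustration rather than the method of the cited works:
the function name anneal_feature_mask, the geometric cooling schedule, and
all parameter defaults are hypothetical, and quality is any user-supplied
function that maps a mask to a value to be maximized.

```python
import math
import random
from typing import Callable, List, Tuple


def anneal_feature_mask(quality: Callable[[List[bool]], float],
                        n_features: int,
                        steps: int = 200,
                        t_start: float = 1.0,
                        t_end: float = 0.01,
                        seed: int = 0) -> Tuple[List[bool], float]:
    """Simulated annealing over feature masks (True = feature selected)."""
    rng = random.Random(seed)
    mask = [True] * n_features            # start from the full feature set
    q = quality(mask)
    best_mask, best_q = mask[:], q
    for step in range(steps):
        # Geometric cooling from t_start to t_end (assumed schedule).
        t = t_start * (t_end / t_start) ** (step / max(1, steps - 1))
        cand = mask[:]
        k = rng.randrange(n_features)
        cand[k] = not cand[k]             # flip one feature in or out
        if not any(cand):
            continue                      # never evaluate the empty subset
        q_cand = quality(cand)
        # Always accept improvements; accept deteriorations with a
        # probability that shrinks as the temperature drops.
        if q_cand >= q or rng.random() < math.exp((q_cand - q) / t):
            mask, q = cand, q_cand
            if q > best_q:
                best_mask, best_q = mask[:], q
    return best_mask, best_q


def toy_quality(mask: List[bool]) -> float:
    """Hypothetical criterion: features 2 and 3 help, features 0 and 1 hurt."""
    return (mask[2] + mask[3]) / 2 - 0.1 * (mask[0] + mask[1])


if __name__ == "__main__":
    print(anneal_feature_mask(toy_quality, n_features=4))
```

Unlike SBS, a single flip can both reject and reselect a feature, so the
search can return to configurations it left earlier. The price is a result
that depends on the random seed and the cooling parameters.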
The permanent elimination of redundant and irrelevant features from the
sample set by FS provides an effective means of dimensionality reduction.
However, the crispness of the selection process can increase the sensitivity
to variations in the feature representation during generalization, because
the information contained in the discarded features is lost. This raises the
issue of the stability of the FS solution and of the underlying maximum of
the cost function. It is especially painful for data analysis and knowledge
acquisition if minor changes in the data lead to the selection of entirely
different features. The methods discussed so far are specialized to
classification problems and require revision and enhancement with regard to
stability and data analysis.
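
One way to make the stability issue measurable is to repeat FS on slightly
perturbed versions of the data and compare the selected subsets. The sketch
below uses the mean pairwise Jaccard index for this comparison. This is an
assumption chosen for illustration, one common choice among several, and not
a measure taken from the text; the function names are hypothetical.

```python
from itertools import combinations
from typing import Iterable, Sequence


def jaccard(a: Iterable[int], b: Iterable[int]) -> float:
    """Overlap of two feature subsets: |A and B| / |A or B|."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0                     # two empty selections agree trivially
    return len(a & b) / len(a | b)


def selection_stability(subsets: Sequence[Iterable[int]]) -> float:
    """Mean pairwise Jaccard index over the subsets selected in repeated
    FS runs; 1.0 means identical selections, 0.0 means disjoint ones."""
    pairs = list(combinations([tuple(s) for s in subsets], 2))
    if not pairs:
        return 1.0
    return sum(jaccard(a, b) for a, b in pairs) / len(pairs)


if __name__ == "__main__":
    # Hypothetical outcomes of three FS runs on resampled data.
    runs = [(3, 4), (3, 4), (2, 4)]
    print(selection_stability(runs))
```

A value close to 1 indicates that the same features are selected despite
minor changes in the data, while a value close to 0 corresponds to the
situation described above, in which entirely different features are chosen.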
Visualization Techniques and Dedicated Tools. In contrast to the state
of the art, e.g., static scatter plots, the methodology pursued in this
research work uses the achieved projections as the baseline for interactive
human analysis. Interactive CAD-like visualization techniques, e.g.,
interactive navigation, diverse component plots, grid plots, and attribute
plots, support human perception and analysis [2.24]. Figure 2.17 gives a
taxonomy of relevant visualization techniques for large high-dimensional
data. For instance, at each projection point, the value of a selected
variable can be plotted in a Hinton diagram style, i.e., the variable value
is coded by the side length
 