feature combinations in general is out of the question. Several alternative
search strategies for FS, employing the cost functions from Section 2.3.2, will
be summarized with regard to achievable performance and required compu-
tational effort.
First-Order Selection Techniques. One simple but often effective way of finding a suboptimal solution with minimum effort is to compute an individual figure of merit for each feature. This first-order approach neglects possible higher-order correlations between feature pairs or feature tuples. For the assessment, i.e., the computation of the figure of merit, one of the cost functions given in the previous subsection is applied; in this simplified case, however, the cost function is computed separately for each feature. Three variants are basically feasible:
• The figure of merit is computed for a selected feature and a selected combination of classes, i.e., the feature's contribution to pairwise class discrimination is assessed. For instance, the measure $q_{ij}^{x_l}$ could be computed here. For each class pair, the features are ranked according to their individual merit, and a selection from these rank tables can be made, for instance, by choosing all features in first-rank position. Table 2.1 gives an example of this first-order selection scheme for the well-known Iris data (see also the code sketch after this list). Obviously, for the first-rank position R, features 3 and 4 will be selected. The method can be computed very quickly, but the rank table grows for a given feature number M and class number L as $M \cdot L(L-1)/2$.
• The figure of merit is computed for a selected feature and for the discrimination of one class versus all others. The corresponding rank table grows for a given feature number M and class number L as $M \cdot L$.
• Computing the figure of merit with regard to discriminating all classes for each feature returns a single column with M elements.
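To make the pairwise scheme concrete, the following sketch (Python with NumPy) builds such a rank table. It is only an illustration under the assumption that a simple Fisher-like separability ratio may stand in for the overlap measure $q_{ij}^{x_l}$ defined in Section 2.3.2; the helper names and the generated toy data are hypothetical and do not reproduce the Iris values of Table 2.1.

```python
# Illustrative sketch of first-order, pairwise feature assessment.
# The Fisher-like ratio below is a hypothetical stand-in for the
# parametric overlap measure q_ij^{x_l}; it is NOT the definition
# from Section 2.3.2.
from itertools import combinations

import numpy as np


def fisher_like_score(a, b):
    """Separability of two one-dimensional class samples (larger is better)."""
    return (a.mean() - b.mean()) ** 2 / (a.var() + b.var() + 1e-12)


def pairwise_rank_table(X, y):
    """Figure of merit per feature and class pair (first scheme in the list).

    Returns {(i, j): [(feature_index, score), ...]} with each list sorted
    by descending merit, i.e., position 0 is the first-rank feature R.
    """
    table = {}
    for i, j in combinations(np.unique(y), 2):
        scores = [(l, fisher_like_score(X[y == i, l], X[y == j, l]))
                  for l in range(X.shape[1])]
        table[(int(i), int(j))] = sorted(scores, key=lambda s: s[1], reverse=True)
    return table


# Toy data with M = 4 features and L = 3 classes, so the rank table holds
# M * L * (L - 1) / 2 = 12 entries (cf. the growth rule stated above).
rng = np.random.default_rng(0)
y = np.repeat(np.arange(3), 50)
X = rng.normal(size=(150, 4)) + y[:, None] * np.array([0.1, 0.2, 1.0, 1.2])

ranks = pairwise_rank_table(X, y)
# "Choose all features in first-rank position" across the class pairs
# (feature indices are 0-based here):
selected = {cols[0][0] for cols in ranks.values()}
print(selected)
```

The one-versus-rest and all-classes variants from the list can be obtained with the same kind of score by changing only how the samples are grouped, which shrinks the table to $M \cdot L$ or M entries, respectively.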
As shown in Table 2.1, the parametric overlap measure $q_{ij}^{x_l}$ and its variants can serve all three approaches to fast first-order feature selection. If the parametric assumption is met, this simple scheme can be very effective. However, in many practical cases, even the one-dimensional distributions of the individual features turn out to be nonparametric in nature. An effective remedy for this situation is the application of, e.g., the overlap
Table 2.1. Rank table from first-order assessment for Iris data
(R: rank of the feature for the given class pair; C i-j: figure of merit for discriminating classes i and j).

Feature   R   C 1-2   R   C 1-3   R   C 2-3
x_1       4   1.020   3   1.482   3   0.442
x_2       3   1.065   4   0.890   4   0.255
x_3       2   4.139   1   5.451   2   1.218
x_4       1   4.387   2   5.180   1   1.660
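The nonparametric case mentioned above can be illustrated by a histogram-based overlap estimate for a single feature and class pair. This is only a sketch under the assumption that a coarse histogram intersection is an acceptable approximation of the one-dimensional distribution overlap; the function below is hypothetical and is not one of the variants defined in Section 2.3.2.

```python
# Hypothetical nonparametric overlap estimate for one feature and one
# class pair; histogram intersection is used here only as an illustration.
import numpy as np


def histogram_overlap(a, b, bins=20):
    """Overlap of two one-dimensional samples via normalized histograms
    on a common bin grid (0 = fully separated, 1 = identical)."""
    lo, hi = min(a.min(), b.min()), max(a.max(), b.max())
    ha, _ = np.histogram(a, bins=bins, range=(lo, hi))
    hb, _ = np.histogram(b, bins=bins, range=(lo, hi))
    return float(np.minimum(ha / ha.sum(), hb / hb.sum()).sum())


# Example call for two well-separated synthetic classes:
rng = np.random.default_rng(1)
print(histogram_overlap(rng.normal(0.0, 1.0, 50), rng.normal(2.5, 1.0, 50)))
```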
 