Table 4.1 The features (first column) sorted by weight (second column); cumulative percentage of
weight (third column); the accuracy, by subset, of EDBFM method (fourth column); the accuracy,
by subset, on a random weight model (fifth column)
Rank       Weight   Weight    1NN Acc.%    1NN Acc.%
                    %Cum.     Cum. Norm    Cum. Norm
                              (BVQ-FR)     (rand. rank)
Feat. 1    0.557     27.41       6.08        51.79
1to2       0.461     50.10     100           57.27
1to3       0.081     54.07      99.09        56.57
1to4       0.075     57.76      97.40        57.82
1to5       0.063     60.82      95.94        56.26
1to6       0.062     63.88      95.15        59.22
1to7       0.058     66.74      91.54        59.16
1to8       0.057     69.53      90.30        59.54
1to9       0.055     72.18      90.41        58.88
1to10      0.054     74.83      88.61        58.66
…          …         …          …            …
1to22      0.035    100.00      79.93        79.93
The accuracies in the fourth and fifth columns, normalized to 100%, are also plotted alongside
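The cumulative percentage of weight in the third column can be illustrated with a short sketch. It assumes the column is the running sum of the sorted weights divided by the total weight of all features; the function name and the toy weights are hypothetical, since the table elides most of the 22 rows.

```python
from itertools import accumulate


def cumulative_weight_percent(weights):
    """Running share (in %) of the total weight, features taken in descending weight order."""
    ordered = sorted(weights, reverse=True)
    total = sum(ordered)
    if total <= 0:
        raise ValueError("weights must have a positive sum")
    # accumulate() yields the partial sums w1, w1+w2, w1+w2+w3, ...
    return [100.0 * partial / total for partial in accumulate(ordered)]


# Hypothetical toy weights, not the values from Table 4.1.
toy_weights = [1.0, 2.0, 1.0]
print(cumulative_weight_percent(toy_weights))  # [50.0, 75.0, 100.0]
```

By construction the last entry is always 100%, which matches the final row of the table.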
where AreaFR is the area underneath the accuracy curve relative to BVQ-FR, obtained by summing
the accuracy values at each feature subset, namely the accuracy values in the BVQ-FR column of
Table 4.1. Analogously, AreaRP is the area underneath the curve obtained by random permutation
of the features. AreaMax is the area underneath a theoretical curve that reaches the maximum
possible accuracy of 100% with the top-ranked feature and thereafter remains constant up to the
full dataset. AreaFR is expected to be geometrically bounded between the other two curves;
mathematically, 0 ≤ Â ≤ 1, where Â represents a relative area. When AreaFR approximates AreaMax,
Â = 1, and the ranking model approximates an ideal order of the features, where the first feature
is the most significant and carries all the weight needed to discriminate between classes.
Conversely, when AreaFR approximates AreaRP, Â = 0, and the ranking model approximates a random
ordering of the features and is therefore useless.
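As a minimal sketch of the relative area, the snippet below sums the accuracy values per feature subset and normalizes the result. The normalization (AreaFR - AreaRP) / (AreaMax - AreaRP) is an assumption: it is one form that equals 1 when AreaFR matches AreaMax and 0 when AreaFR matches AreaRP, as the text requires, but the defining formula itself is not reproduced in this excerpt. The function name and the toy accuracy curves are hypothetical.

```python
def relative_area(acc_fr, acc_rp, max_acc=100.0):
    """Relative area of a ranking curve between the random and ideal curves.

    acc_fr: accuracies (in %) per feature subset for the ranking under test.
    acc_rp: accuracies (in %) per feature subset for a random permutation.
    Assumed normalization: (AreaFR - AreaRP) / (AreaMax - AreaRP).
    """
    if len(acc_fr) != len(acc_rp):
        raise ValueError("both curves must cover the same feature subsets")
    area_fr = sum(acc_fr)              # area under the ranking curve
    area_rp = sum(acc_rp)              # area under the random-permutation curve
    area_max = max_acc * len(acc_fr)   # ideal curve: max accuracy from the first feature on
    if area_max == area_rp:
        raise ValueError("random curve already equals the ideal curve")
    return (area_fr - area_rp) / (area_max - area_rp)


# Hypothetical toy curves over three feature subsets.
random_curve = [50.0, 75.0, 100.0]
print(relative_area([100.0, 100.0, 100.0], random_curve))  # ideal ordering
print(relative_area(random_curve, random_curve))           # no better than random
print(relative_area([80.0, 90.0, 100.0], random_curve))    # in between
```

With this normalization, a ranking that performs worse than the random permutation would yield a negative value, so the bounds 0 and 1 hold only when AreaFR lies between the other two areas.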
4.5.2 Benchmarking the EDBFM Ranking Method
In this section the EDBFM ranking method is tested on complex, real-world datasets, and the
resulting rank models are compared with those of other methods. Testing includes two phases:
- studying EDBFM performance when different FE algorithms are included;
- comparing EDBFM ranking with heuristic methods.
 
 