Digital Signal Processing Reference
Table 1. The values of the discussed diversity measures for the cases described in Fig. 1
Case   ρ av (↓)   Dis av (↑)   DF av (↓)   [?] av (↓)   E (↑)   KW (↑)   θ (↓)   GD (↑)   CFD (↑)
a)      1          0            1           -0.5         0       0        0       0        0
b)      0.11       0.44         0.33         0.17        0.67    0.15     0.10    0.4      0.67
c)     -0.33       0.67         0           -0.5         1       0.22     0       1        1
d)     -0.33       0.67         0.11        -0.33        1       0.22     0.03    0.75     0.83
e)      1          0            0.33         1           0       0        0.22    0        0
f)     -0.33       0.44         0.11         0           0.67    0.15     0.07    0.67     0.75
g)      0.56       0.22         0.11         0.17        0.33    0.07     0.10    0.5      0.5
h)     -1          0.67         0           -0.5         1       0.22     0       1        1
i)     -0.11       0.44         0           -0.33        0.67    0.15     0.03    1        1
j)      0.56       0.22         0           -0.33        0.33    0.07     0.03    1        1
k)      1          0            0           -0.5         0       0        0       1        0
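As a rough illustration of how three of the tabulated measures are obtained, the sketch below computes the averaged disagreement (Dis av), the averaged double fault (DF av), and the Kohavi-Wolpert variance (KW) from an oracle-output matrix, using their standard definitions. The function name `pairwise_diversity`, the matrix layout, and the example matrix are assumptions made for this illustration; the example is not taken from Fig. 1.

```python
from itertools import combinations


def pairwise_diversity(oracle):
    """Return (Dis_av, DF_av, KW) for an oracle-output matrix.

    oracle[i][j] is 1 if classifier i labels object j correctly, else 0.
    The layout and the function name are hypothetical, chosen for this sketch.
    """
    L = len(oracle)        # number of base classifiers
    N = len(oracle[0])     # number of objects
    if L < 2:
        raise ValueError("at least two classifiers are required")

    pairs = list(combinations(range(L), 2))

    # Disagreement: fraction of objects on which exactly one of the pair is correct.
    dis = sum(
        sum(1 for a, b in zip(oracle[i], oracle[k]) if a != b) / N
        for i, k in pairs
    ) / len(pairs)

    # Double fault: fraction of objects that both classifiers of the pair misclassify.
    df = sum(
        sum(1 for a, b in zip(oracle[i], oracle[k]) if a == 0 and b == 0) / N
        for i, k in pairs
    ) / len(pairs)

    # Kohavi-Wolpert variance: (1 / (N * L^2)) * sum_j l(z_j) * (L - l(z_j)),
    # where l(z_j) is the number of classifiers that are correct on object j.
    correct = [sum(oracle[i][j] for i in range(L)) for j in range(N)]
    kw = sum(l * (L - l) for l in correct) / (N * L * L)

    return dis, df, kw


# Illustrative matrix: three classifiers, each wrong on a different object.
example = [
    [0, 1, 1],
    [1, 0, 1],
    [1, 1, 0],
]
dis_av, df_av, kw = pairwise_diversity(example)
print(round(dis_av, 2), round(df_av, 2), round(kw, 2))
```

For this illustrative matrix the three values round to 0.67, 0 and 0.22, which coincide with the Dis av, DF av and KW entries listed for rows c) and h). This does not imply that either case in Fig. 1 has exactly this configuration.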
3 New Measure for Ensemble Learning
Although it has been argued that a diversity measure should not be another estimate of
the classification accuracy [2,18], we think it is important, and even necessary, to take
the classification accuracy into consideration. A diversity measure should be either
positively or negatively correlated with the generalization performance of the ensemble;
otherwise, it is of no use for evaluating the performance of the base classifiers in the
ensemble or of the ensemble itself. Moreover, a diversity measure that does not consider
the classification accuracy must still be used together with some measure of
classification accuracy, and it is then very difficult to trade off diversity against
accuracy. The 'good' diversity and 'bad' diversity proposed in [20] are also in line
with this idea.
In this paper, we propose the following new measure, in which both the diversity and
the accuracy are considered:
\[
DA = 1 - \sqrt{\frac{2}{L(L-1)} \sum_{i=1}^{L-1} \sum_{k=i+1}^{L} \left( \frac{N^{00}_{i,k}}{N} \right)^{2}} - \sqrt{\frac{1}{N} \sum_{j=1}^{N} \left( \frac{L - l(z_j)}{L} \right)^{2}} \tag{11}
\]
Base classifiers should make their errors on different objects, and the double fault
measure provides a way to evaluate this aspect. The second term in the proposed
measure is similar to the double fault measure in that both are calculated from the
number of objects misclassified by both base classifiers of a pair. However, this
term takes the form of a mean squared value in order to penalize large differences
among the numbers of objects jointly misclassified by the individual classifier pairs,
since a large difference may degrade the performance of the ensemble.
On the other hand, among the misclassified objects, the fewer base classifiers that
misclassify an object, the more diverse the ensemble is. This idea motivates the third
term in the proposed measure, which also takes the form of a mean squared value.
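The two terms described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation. It assumes the measure has the form DA = 1 - (second term) - (third term), where the second term is the square root of the mean, over classifier pairs, of the squared fraction of objects that both classifiers misclassify, and the third term is the square root of the mean, over objects, of the squared fraction of classifiers that misclassify each object. The function name `da_measure` and the oracle-output matrix layout are hypothetical.

```python
from itertools import combinations
from math import sqrt


def da_measure(oracle):
    """Sketch of the DA measure under the assumptions stated in the text above.

    oracle[i][j] is 1 if base classifier i labels object j correctly, else 0
    (a hypothetical layout chosen for this example).
    """
    L = len(oracle)        # number of base classifiers
    N = len(oracle[0])     # number of objects
    if L < 2:
        raise ValueError("at least two base classifiers are required")

    # Second term: for each pair (i, k), the fraction of objects misclassified
    # by both classifiers, squared, averaged over all L(L-1)/2 pairs.
    pair_squares = [
        (sum(1 for a, b in zip(oracle[i], oracle[k]) if a == 0 and b == 0) / N) ** 2
        for i, k in combinations(range(L), 2)
    ]
    second_term = sqrt(sum(pair_squares) / len(pair_squares))

    # Third term: for each object, the fraction of classifiers that misclassify
    # it, squared, averaged over all N objects. Correctly classified objects
    # contribute zero.
    object_squares = [
        ((L - sum(oracle[i][j] for i in range(L))) / L) ** 2
        for j in range(N)
    ]
    third_term = sqrt(sum(object_squares) / N)

    return 1.0 - second_term - third_term


# Three classifiers, each wrong on a different object: no double faults,
# and every object is misclassified by exactly one classifier.
spread_errors = [
    [0, 1, 1],
    [1, 0, 1],
    [1, 1, 0],
]
# Three identical classifiers, all wrong on the first object.
shared_errors = [
    [0, 1, 1],
    [0, 1, 1],
    [0, 1, 1],
]
print(da_measure(spread_errors), da_measure(shared_errors))
```

Under these assumptions, the ensemble whose errors fall on different objects scores higher than the ensemble whose members all fail on the same object, even though every classifier makes the same number of errors. This is the behaviour the second and third terms are meant to reward.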