5.5.2 Noise Filters for Class Noise
Filtering is often claimed to be useful in the presence of noise. This section
examines whether that claim holds and to what extent. As a simple but
representative case study, we show the results of applying noise filters that
detect and eliminate mislabeled training instances. We want to illustrate that
applying filters is a good strategy for obtaining better results even when the
amount of noise is low. As filters are mainly designed for class noise, we focus on the
two types of class noise described in this chapter: uniform class noise and
pairwise class noise.
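The two noise schemes can be made concrete with a small sketch. It assumes the usual definitions: uniform class noise relabels a fixed fraction of the examples with a different class chosen at random, while pairwise class noise relabels each example of the majority class as the second most frequent class with a given probability. The function names and the `rng` parameter are illustrative choices, not part of the original study.

```python
import random
from collections import Counter


def add_uniform_class_noise(labels, rate, rng):
    """Relabel a fraction `rate` of the examples with a different class.

    Assumption: exactly round(rate * n) examples are corrupted, and the
    replacement class is drawn uniformly from the other classes.
    """
    classes = sorted(set(labels))
    noisy = list(labels)
    n_corrupt = round(rate * len(labels))
    for i in rng.sample(range(len(labels)), n_corrupt):
        noisy[i] = rng.choice([c for c in classes if c != labels[i]])
    return noisy


def add_pairwise_class_noise(labels, rate, rng):
    """Relabel majority-class examples as the second majority class.

    Assumption: each majority-class example is flipped independently
    with probability `rate`; all other examples are left untouched.
    """
    (majority, _), (second, _) = Counter(labels).most_common(2)
    return [second if y == majority and rng.random() < rate else y
            for y in labels]
```

Under this sketch, uniform noise can touch any label, whereas pairwise noise only affects the majority class, so fewer labels are corrupted for the same nominal rate.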
Three popular classifiers are used to obtain the accuracy values: C4.5,
Ripper and an SVM. They were not selected at random. SVMs are known to be
very accurate but also sensitive to noise. Ripper is a rule learning algorithm that
performs moderately well but, as we saw in Sect. 5.4, rule learners are also sensitive to
noise when they are not designed to cope with it. The third classifier is C4.5 with
its pruning strategy, which is known to diminish the effects of noise in the final
tree. Table 5.2 shows the average results of the three noise filters (EF, CVCF and IPF)
for each kind of class noise studied. The amount of noise ranges from 5 to 20%, enough
to show the differences between no filtering (labeled as “None”) and the noise filters.
To ease reading, the results shown are averaged over all the data sets considered.
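The idea behind this kind of filter, detecting and eliminating mislabeled training instances, can be sketched in a few lines. The code below is not EF, CVCF or IPF. It is a simplified cross-validated filter that uses a single 1-nearest-neighbor classifier as a stand-in, and the function names and the fold assignment by index are assumptions made for illustration. An instance is removed when a model trained on the other folds disagrees with its label.

```python
def nearest_neighbor_label(train_X, train_y, x):
    """Predict the label of x as that of its closest training point
    (squared Euclidean distance)."""
    best = min(range(len(train_X)),
               key=lambda i: sum((a - b) ** 2 for a, b in zip(train_X[i], x)))
    return train_y[best]


def cross_validated_filter(X, y, n_folds=5):
    """Return the sorted indices of the instances that are kept.

    Each instance is classified by a 1-NN model built from the other
    folds; instances whose label disagrees with the prediction are
    treated as mislabeled and dropped.
    """
    kept = []
    for fold in range(n_folds):
        # Hypothetical fold assignment: instance i belongs to fold i % n_folds.
        test_idx = [i for i in range(len(X)) if i % n_folds == fold]
        train_idx = [i for i in range(len(X)) if i % n_folds != fold]
        train_X = [X[i] for i in train_idx]
        train_y = [y[i] for i in train_idx]
        for i in test_idx:
            if nearest_neighbor_label(train_X, train_y, X[i]) == y[i]:
                kept.append(i)
    return sorted(kept)
```

A classifier would then be trained only on the kept instances. Note that such a filter may also discard clean instances that lie close to mislabeled ones, which is one reason practical filters combine several classifiers or iterate.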
The evolution of the results and their tendencies are better depicted
graphically. Figure 5.4a shows the performance of SVM from 0% of controlled
pairwise noise up to the final 20% introduced. The accuracy drops from an
initial 90% to 85% when only 20% of the class labels are corrupted.
The degradation is even worse in the case of uniform class noise, depicted
in Fig. 5.4b, as all the class labels can be affected. The evolution of not using any
Table 5.2 Filtering of class noise over three classic classifiers

                     Pairwise class noise                Uniform random class noise
Classifier  Filter   0%     5%     10%    15%    20%     0%     5%     10%    15%    20%
SVM         None     90.02  88.51  86.97  86.14  84.86   90.02  87.82  86.43  85.18  83.20
            EF       90.49  89.96  89.07  88.33  87.40   90.49  89.66  88.78  87.78  86.77
            CVCF     90.56  89.86  88.94  88.28  87.76   90.48  89.56  88.72  87.92  86.54
            IPF      90.70  90.13  89.37  88.85  88.27   90.58  89.79  88.97  88.48  87.37
Ripper      None     82.46  81.15  80.35  79.39  78.49   82.46  79.81  78.55  76.98  75.68
            EF       83.36  82.87  82.72  82.43  81.53   83.46  83.03  82.87  82.30  81.66
            CVCF     83.17  82.93  82.64  82.03  81.68   83.17  82.59  82.19  81.69  80.45
            IPF      83.74  83.59  83.33  82.72  82.44   83.74  83.61  82.94  82.94  82.48
C4.5        None     83.93  83.66  82.81  82.25  81.41   83.93  82.97  82.38  81.69  80.28
            EF       84.18  84.07  83.70  83.20  82.36   84.16  83.96  83.53  83.38  82.66
            CVCF     84.15  83.92  83.24  82.54  82.13   84.15  83.61  83.00  82.84  81.61
            IPF      84.44  84.33  83.92  83.38  82.53   84.44  83.89  83.84  83.50  82.72
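
To read the table in terms of degradation, one can compute how many accuracy points each configuration loses between 0% and 20% noise. The short sketch below does this for the SVM rows only. The values are copied from Table 5.2, while the variable and function names are illustrative.

```python
# SVM accuracy (%) at 0, 5, 10, 15 and 20% noise, copied from Table 5.2.
svm_pairwise = {
    "None": [90.02, 88.51, 86.97, 86.14, 84.86],
    "EF":   [90.49, 89.96, 89.07, 88.33, 87.40],
    "CVCF": [90.56, 89.86, 88.94, 88.28, 87.76],
    "IPF":  [90.70, 90.13, 89.37, 88.85, 88.27],
}
svm_uniform = {
    "None": [90.02, 87.82, 86.43, 85.18, 83.20],
    "EF":   [90.49, 89.66, 88.78, 87.78, 86.77],
    "CVCF": [90.48, 89.56, 88.72, 87.92, 86.54],
    "IPF":  [90.58, 89.79, 88.97, 88.48, 87.37],
}


def accuracy_drop(series):
    """Accuracy points lost between the first (0%) and last (20%) noise level."""
    return round(series[0] - series[-1], 2)


pairwise_drop = {name: accuracy_drop(v) for name, v in svm_pairwise.items()}
uniform_drop = {name: accuracy_drop(v) for name, v in svm_uniform.items()}
```

Without filtering, SVM loses 5.16 points under pairwise noise and 6.82 under uniform noise. With IPF the losses shrink to 2.43 and 3.21 points respectively.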