computed by $1 - r_{ij}$). These outputs are represented by a score matrix $R$:

$$
R = \begin{pmatrix}
- & r_{12} & \cdots & r_{1M} \\
r_{21} & - & \cdots & r_{2M} \\
\vdots & \vdots & \ddots & \vdots \\
r_{M1} & r_{M2} & \cdots & -
\end{pmatrix}
\tag{5.3}
$$
The final output is derived from the score matrix by different aggregation models. The most commonly used and simplest combination, also considered in the experiments of this chapter, is the application of a voting strategy:
$$
\text{Class} = \underset{i=1,\dots,M}{\arg\max} \sum_{1 \le j \ne i \le M} s_{ij}
\tag{5.4}
$$
where $s_{ij}$ is 1 if $r_{ij} > r_{ji}$ and 0 otherwise. Therefore, the class with the largest number of votes is predicted. This strategy has proved to be competitive with different classifiers, obtaining results similar to those of more complex strategies [21].
5.5 Empirical Analysis of Noise Filters and Robust Strategies
In this section we illustrate the advantages of the noise handling approaches described above.
5.5.1 Noise Introduction
In the data sets we are going to use (taken from Chap. 2), as in most real-world data sets, the initial amount and type of noise present is unknown. Therefore, no assumptions about the base noise type and level can be made. For this reason, these data sets are considered to be noise free, in the sense that no recognizable noise has been introduced. In order to control the amount of noise in each data set and check how it affects the classifiers, noise is introduced into each data set in a supervised manner. Four different noise schemes proposed in the literature, as explained in Sect. 5.2, are used in order to introduce a noise level x% into each data set:
1. Introduction of class noise (both schemes are sketched in code after this list).
   - Uniform class noise [84]: x% of the examples are corrupted. The class labels of these examples are randomly replaced by another one from the M classes.
   - Pairwise class noise [100, 102]: let X be the majority class and Y the second majority class; an example with the label X then has a probability of x/100 of being incorrectly labeled as Y.
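As a concrete illustration of the two class noise schemes, here is a minimal sketch of how such corruption can be implemented; the function names, the use of NumPy, and the assumption that the labels form a 1-D integer array are ours, not part of the original experimental setup:

    import numpy as np

    def uniform_class_noise(y, x, n_classes, seed=None):
        # Uniform class noise [84]: corrupt x% of the examples, replacing each
        # selected label with a class drawn at random from the remaining M - 1
        # classes (one common reading of "another one from the M classes").
        rng = np.random.default_rng(seed)
        y = y.copy()
        n_noisy = int(round(len(y) * x / 100.0))
        idx = rng.choice(len(y), size=n_noisy, replace=False)
        for i in idx:
            candidates = [c for c in range(n_classes) if c != y[i]]
            y[i] = rng.choice(candidates)
        return y

    def pairwise_class_noise(y, x, seed=None):
        # Pairwise class noise [100, 102]: each example of the majority class X
        # is relabeled as the second majority class Y with probability x/100.
        rng = np.random.default_rng(seed)
        y = y.copy()
        classes, counts = np.unique(y, return_counts=True)
        order = np.argsort(counts)[::-1]
        X_cls, Y_cls = classes[order[0]], classes[order[1]]
        for i in np.where(y == X_cls)[0]:
            if rng.random() < x / 100.0:
                y[i] = Y_cls
        return y

Both functions leave the original label vector untouched and return a corrupted copy, so the same clean data set can be reused to generate several noise levels x.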
 