Apart from the uniform class noise, NAR label noise has been widely studied in the literature. An example is the pairwise label noise, where examples of two selected classes are mislabeled as each other with a certain probability. In this pairwise label noise (or pairwise class noise), only two off-diagonal positions of the label transition matrix are nonzero. Another problem derived from the NAR noise model is that it is not trivial to decide whether the class labels are useful or not.
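As a concrete illustration of the pairwise noise model just described, the following sketch corrupts the labels of two chosen classes while leaving all others untouched; the function name and parameters are illustrative, not from any standard library.

```python
import numpy as np

def pairwise_label_noise(y, class_a, class_b, p, rng=None):
    """Flip labels between two chosen classes with probability p.

    Only the (a -> b) and (b -> a) entries of the implicit label
    transition matrix are nonzero off the diagonal, matching the
    pairwise (NAR) noise model: corruption depends on the label only.
    """
    rng = np.random.default_rng(rng)
    y_noisy = y.copy()
    flip = rng.random(len(y)) < p          # one independent coin per example
    y_noisy[(y == class_a) & flip] = class_b
    y_noisy[(y == class_b) & flip] = class_a
    return y_noisy

# Usage: corrupt classes 0 and 1 with probability 0.2; class 2 is untouched.
y = np.array([0, 0, 1, 1, 2, 2, 0, 1])
y_noisy = pairwise_label_noise(y, class_a=0, class_b=1, p=0.2, rng=42)
```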
The third and last noise model is the noisy not at random (NNAR) model, where the input attributes somehow affect the probability of the class label being erroneous, as shown in Fig. 5.3c. An example of this is illustrated by Klebanov [49], where evidence is given that difficult samples are randomly labeled. It also occurs that examples similar to existing ones are labeled by experts in a biased way, having a higher probability of being mislabeled the more similar they are. The NNAR model is the most general case of class noise [59], where the error $E$ depends on both $X$ and $Y$, and it is the only model able to characterize mislabelings at the class borders or those due to poor sampling density. As shown in [19], the probability of error is much more complex than in the two previous cases, as it has to take into account the density function of the input over the input feature space $\mathcal{X}$ when continuous:
$$
p_n = P(E = 1) = \sum_{c_i \in C} \int_{x \in \mathcal{X}} P(X = x \mid Y = c_i)\, P(E = 1 \mid X = x, Y = c_i)\, dx \qquad (5.2)
$$
As a consequence, the perfect identification and estimation of NNAR noise is almost impossible, relying on approximating it from expert knowledge of the problem and the domain.
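A minimal sketch of NNAR class noise can make the dependence on $X$ concrete. Here, as an illustrative assumption (not taken from the text), the mislabeling probability decays exponentially with the distance to a simple class border, so examples near the border are flipped far more often; all names and the decay shape are hypothetical choices.

```python
import numpy as np

def nnar_class_noise(X, y, boundary=0.5, scale=0.1, max_p=0.4, rng=None):
    """Mislabel binary classes with a probability that grows near the
    class border, i.e. P(E=1 | X=x, Y=y) depends on x (NNAR model).

    The border is taken to be the hyperplane x[0] == boundary; `scale`
    controls how quickly the noise probability decays away from it.
    """
    rng = np.random.default_rng(rng)
    dist = np.abs(X[:, 0] - boundary)        # distance to the class border
    p_err = max_p * np.exp(-dist / scale)    # high noise near the border
    flip = rng.random(len(y)) < p_err
    return np.where(flip, 1 - y, y)

# Usage: points near x0 = 0.5 are mislabeled far more often than distant ones.
rng = np.random.default_rng(0)
X = rng.random((1000, 2))
y = (X[:, 0] > 0.5).astype(int)
y_noisy = nnar_class_noise(X, y, rng=1)
```

This is exactly the behavior the model attributes to "difficult" borderline samples: the cleaner the separation from the border, the lower the chance of a flipped label.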
In the case of attribute noise, the noise models described above can be extended and adapted. Here we can also distinguish three possibilities:
•
When the noise appearance depends neither on the values of the rest of the input features nor on the class label, the NCAR noise model applies. This type of noise can occur when distortions in the measurements appear at random, for example in faulty manual data entry or network errors that do not depend on the data content itself.
•
When the attribute noise depends on the true value $x_i$ but not on the rest of the input values $x_1, \ldots, x_{i-1}, x_{i+1}, \ldots, x_n$ or on the observed class label $y$, the NAR model is applicable. An illustrative example is when different temperatures affect their registration in climatic data in different ways depending on the temperature value itself.
•
In the last case, the noise probability depends on the value of the feature $x_i$ but also on the rest of the input feature values $x_1, \ldots, x_{i-1}, x_{i+1}, \ldots, x_n$. This is a very complex situation in which the value is altered when the rest of the features present a particular combination of values, as in medical diagnosis when some test results are filled in with an expert's prediction, without conducting the test, due to its high cost.
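The second case above, value-dependent attribute noise, can be sketched in a few lines. As a toy analogue of the temperature example, and under the purely illustrative assumption that the measurement error scale grows in proportion to the magnitude of the reading, the corruption depends on $x_i$ itself but on nothing else:

```python
import numpy as np

def nar_attribute_noise(x, rng=None):
    """Perturb one numeric attribute with value-dependent noise (NAR):
    the corruption depends on x_i itself but not on the other features
    or the class label. Larger readings get proportionally larger
    measurement errors (an assumed, illustrative noise shape).
    """
    rng = np.random.default_rng(rng)
    sigma = 0.05 * np.abs(x)     # noise scale grows with the true value
    return x + rng.normal(0.0, sigma)

# Usage: the 40-degree reading is perturbed far more than the 10-degree one.
temps = np.array([-5.0, 10.0, 25.0, 40.0])
noisy = nar_attribute_noise(temps, rng=0)
```

The NCAR variant would use a constant `sigma`, and an NNAR variant would let `sigma` depend on the remaining features as well.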
For the sake of brevity, we will not develop the probability error equations here, as their expressions would vary depending on the nature of each input feature $x_i$.