2. Introduction of attribute noise
Uniform attribute noise [ 100 , 104 ] x% of the values of each attribute in the
data set are corrupted. To corrupt each attribute A i , x% of the examples in
the data set are chosen, and their A i value is replaced by a random value from
the domain D i of the attribute A i . A uniform distribution is used for both
numerical and nominal attributes.
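As a rough sketch of this scheme (the function and parameter names are illustrative, not taken from any particular library), corrupting x% of the examples for each attribute with a uniformly drawn value could look like:

```python
import random

def uniform_attribute_noise(data, noise_level, domains, seed=0):
    """Corrupt a fraction `noise_level` of the examples of each attribute
    with a value drawn uniformly from that attribute's domain.

    `data` is a list of examples (lists of attribute values); `domains`
    gives, per attribute, a (min, max) tuple for numeric attributes or a
    list of categories for nominal ones.  Names are illustrative.
    """
    rng = random.Random(seed)
    noisy = [row[:] for row in data]              # work on a copy
    n_corrupt = round(noise_level * len(data))    # x% of the examples
    for i, dom in enumerate(domains):
        for r in rng.sample(range(len(data)), n_corrupt):
            if isinstance(dom, tuple):            # numeric: uniform in [min, max]
                noisy[r][i] = rng.uniform(dom[0], dom[1])
            else:                                 # nominal: uniform over categories
                noisy[r][i] = rng.choice(dom)
    return noisy
```

Note that a fresh set of examples is sampled for each attribute, so different attributes are corrupted in different rows.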
Gaussian attribute noise This scheme is similar to the uniform attribute noise,
but in this case, the A i values are corrupted by adding a random value drawn
from a Gaussian distribution of mean 0 and standard deviation (max − min)/5,
where max and min are the limits of the domain D i of the attribute. Nominal
attributes are treated as in the case of the uniform attribute noise.
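Under the same assumptions as before (illustrative names, a (min, max) tuple per numeric attribute and a category list per nominal one), the Gaussian variant differs only in how numeric values are perturbed; the text does not mention clipping the result back into the domain, so this sketch does not clip:

```python
import random

def gaussian_attribute_noise(data, noise_level, domains, seed=0):
    """Corrupt a fraction `noise_level` of the examples of each attribute.

    Numeric attributes get an additive perturbation drawn from a Gaussian
    of mean 0 and standard deviation (max - min) / 5; nominal attributes
    fall back to the uniform scheme.  Names are illustrative.
    """
    rng = random.Random(seed)
    noisy = [row[:] for row in data]
    n_corrupt = round(noise_level * len(data))
    for i, dom in enumerate(domains):
        for r in rng.sample(range(len(data)), n_corrupt):
            if isinstance(dom, tuple):            # numeric: additive Gaussian
                lo, hi = dom
                noisy[r][i] += rng.gauss(0.0, (hi - lo) / 5)
            else:                                 # nominal: uniform over categories
                noisy[r][i] = rng.choice(dom)
    return noisy
```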
In order to create a noisy data set from the original, the noise is introduced into
the training partitions as follows:
1. A level of noise x %, of either class noise (uniform or pairwise) or attribute noise
(uniform or Gaussian), is introduced into a copy of the full original data set.
2. Both data sets, the original and the noisy copy, are partitioned into 5 equal folds,
that is, with the same examples in each one.
3. The training partitions are built from the noisy copy, whereas the test partitions
are formed from examples from the base data set, that is, the noise free data set.
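The index bookkeeping behind the three steps above can be sketched as follows, ignoring stratification for brevity (function and variable names are illustrative). The key point is that the clean data and its noisy copy share the same fold boundaries, so each training set is drawn from the noisy copy and each test set from the clean original:

```python
import random

def noisy_train_clean_test_folds(n_examples, n_folds=5, seed=0):
    """Cut the same `n_folds` folds from the clean data set and its noisy
    copy; return (train_idx, test_idx) pairs where train indices point into
    the noisy copy and test indices into the clean original."""
    idx = list(range(n_examples))
    random.Random(seed).shuffle(idx)
    folds = [idx[k::n_folds] for k in range(n_folds)]
    splits = []
    for k in range(n_folds):
        test_idx = folds[k]                       # taken from the clean data
        train_idx = [i for j in range(n_folds)    # taken from the noisy copy
                     if j != k for i in folds[j]]
        splits.append((train_idx, test_idx))
    return splits
```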
We introduce noise, either class or attribute noise, only into the training sets
since we want to focus on the effects of noise on the training process. This is
done by observing how the classifiers built from different noisy training data
for a particular data set behave, in terms of their accuracy, on the same
clean test data. Thus, the accuracy of the classifier built over the original training
set without additional noise acts as a reference value that can be directly compared
with the accuracy of each classifier obtained with the different noisy training data.
Corrupting the test sets as well would also affect the accuracy obtained by the
classifiers, and our conclusions would therefore no longer be limited to the
effects of noise on the training process.
The accuracy of the classifiers on each data set is estimated by means of 5
runs of a stratified 5-fold cross-validation (5-FCV). Hence, a total of 25 results per
data set, noise type and noise level are averaged. Five partitions are used because
each partition then has a large number of examples, so the noise effects are more
notable, facilitating their analysis.
The robustness of each method is estimated with the relative loss of accuracy
(RLA) (Eq. 5.5 ), which is used to measure the percentage of variation of the accuracy
of the classifiers at a given noise level with respect to the original case with no
additional noise:

RLA x % = ( Acc 0% − Acc x % ) / Acc 0% ,     (5.5)
where RLA x % is the relative loss of accuracy at a noise level x %, Acc 0% is the
test accuracy in the original case, that is, with 0% of induced noise, and Acc x % is
the test accuracy with a noise level x %.
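Eq. 5.5 translates directly into code; a minimal sketch (the function name is illustrative):

```python
def rla(acc_0, acc_x):
    """Relative loss of accuracy (Eq. 5.5): the fraction of the noise-free
    test accuracy `acc_0` that is lost at noise level x%, whose test
    accuracy is `acc_x`."""
    return (acc_0 - acc_x) / acc_0
```

A classifier whose accuracy drops from 0.80 on clean training data to 0.72 at some noise level thus has an RLA of 0.1, i.e. it loses 10% of its original accuracy.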
 