These Γ resulting classifiers are then used to tag each instance in the training set D_T as either correct or mislabeled, by comparing the training label with that assigned by the classifier.
Add to D_N the noisy instances identified in D_T using a voting scheme, taking into account the correctness of the labels obtained in the previous step by the Γ classifiers built. For the IPF filter we use the majority vote scheme.
Add to D_G a percentage y of the good data in D_T. This step is useful when we deal with large data sets because it helps to reduce them faster. We do not eliminate good data with the IPF method in our experimentation (we set y = 0, so D_G is always empty), and we do not lose generality by doing so.
Remove the noisy instances and the good data from the training set: D_T ← D_T \ {D_N ∪ D_G}.
At the end of the iterative process, the filtered data is formed by the remaining instances of D_T and the good data of D_G; that is, D_T ∪ D_G.
A particularity of the voting schemes in IPF is that, as an additional condition, a noisy instance must also be misclassified by the model induced from the subset containing that instance. Moreover, by varying the required number of filtering iterations, the level of conservativeness of the filter can be varied in both schemes, consensus and majority.
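A minimal sketch of this filtering loop in Python is given below, assuming NumPy arrays and scikit-learn-style classifiers. The function name ipf_filter, its parameters, and the simplified stopping rule (a fixed number of iterations with an early stop) are illustrative choices, not part of the original formulation.

import numpy as np
from sklearn.base import clone
from sklearn.tree import DecisionTreeClassifier


def ipf_filter(X, y, base_clf=None, n_partitions=5, max_iterations=3,
               stop_fraction=0.01, scheme="majority", random_state=0):
    """Return the indices of the instances kept after iterative filtering."""
    base_clf = base_clf if base_clf is not None else DecisionTreeClassifier()
    rng = np.random.RandomState(random_state)
    kept = np.arange(len(y))                      # indices of the current D_T

    for _ in range(max_iterations):
        X_t, y_t = X[kept], y[kept]
        parts = np.array_split(rng.permutation(len(kept)), n_partitions)

        votes = np.zeros(len(kept), dtype=int)    # classifiers mislabeling each instance
        own_model_wrong = np.zeros(len(kept), dtype=bool)
        for part in parts:
            clf = clone(base_clf).fit(X_t[part], y_t[part])
            wrong = clf.predict(X_t) != y_t
            votes += wrong
            own_model_wrong[part] = wrong[part]   # mislabeled by the model of its own subset

        if scheme == "majority":
            noisy = votes > n_partitions / 2      # majority vote scheme
        else:
            noisy = votes == n_partitions         # consensus scheme
        noisy &= own_model_wrong                  # additional condition discussed above

        if noisy.sum() <= stop_fraction * len(y):
            break                                 # few noisy instances identified: stop
        kept = kept[~noisy]                       # D_T <- D_T \ D_N (D_G empty, good-data percentage 0)

    return kept

The retained indices define the filtered training set on which the final classifier is then built.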
5.3.4 More Filtering Methods
Apart from the three aforementioned filtering methods, we can find many more in
the specialized literature. We try to provide a helpful summary of the most recent and best-known ones in Table 5.1. For the sake of brevity, we do not describe these methods in as much depth as those in the previous sections. A recent categorization of the different filtering procedures made by Frénay and Verleysen [19] will be followed, as it matches our descriptions well.
5.4 Robust Learners Against Noise
Filtering the data also has one major drawback: some instances will be dropped from the data sets, even if they are valuable. Instead of filtering the data set or modifying the learning algorithm, we can use other approaches to diminish the effect of noise in the learned model. In the case of labeled data, one powerful approach is to train not a single classifier but several, taking advantage of their particular strengths, as sketched below.
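As a simple illustration of this idea (not one of the specific methods discussed in this chapter), several heterogeneous classifiers can be combined by majority vote, for instance with scikit-learn's VotingClassifier; the base learners chosen here are arbitrary.

from sklearn.ensemble import VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import GaussianNB
from sklearn.tree import DecisionTreeClassifier

# Three learners with different biases: errors made by one of them on noisy
# regions of the data can often be outvoted by the other two.
ensemble = VotingClassifier(
    estimators=[
        ("tree", DecisionTreeClassifier(max_depth=5)),
        ("nb", GaussianNB()),
        ("lr", LogisticRegression(max_iter=1000)),
    ],
    voting="hard",   # each classifier casts one vote; the majority label wins
)
# Usage: ensemble.fit(X_train, y_train); predictions = ensemble.predict(X_test)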
In this section we provide a brief insight into classifiers that are known to be robust
to noise to a certain degree, even when the noise is not treated or cleansed. As
said in Sect. 5.1, C4.5 has been considered a paradigmatic robust learner against
noise. However, it is also true that classical decision trees have been considered
 