This makes us think that, because of the way the apparent error is calculated, the algorithm reaches stability more quickly. Finally, we note that BrownBoost does not converge even after 1000 iterations, which confirms that BrownBoost's main weakness is its speed of convergence.
These results are confirmed by Figure 4, which shows, first, that the average generalization error of AdaBoosthyb is lower than that of the other algorithms and, second, that AdaBoosthyb converges more quickly than BrownBoost and especially AdaBoost M1, which does not converge even after 1000 iterations.
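For concreteness, the apparent error discussed above is the error of the current combined classifier measured on the training set itself; tracking it per iteration is one way to observe when a boosting run has stabilized. The following is a minimal sketch, assuming hypotheses with a scikit-learn-style predict method and labels in {-1, +1} (the function name and these assumptions are ours, not the paper's):

```python
import numpy as np

def apparent_error(alphas, hyps, X_train, y_train):
    # Weighted vote of all hypotheses built so far.
    F = sum(a * h.predict(X_train) for a, h in zip(alphas, hyps))
    # Fraction of training examples the combined vote misclassifies.
    return float(np.mean(np.sign(F) != y_train))
```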
3.5 Conclusion
In this paper, we proposed an improvement of AdaBoost based on exploiting the hypotheses already built in the preceding iterations. The experiments carried out show that this approach improves the performance of AdaBoost in error rate, recall, speed of convergence, and tolerance to noise. However, they also show that the approach remains sensitive to noise.
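To make the idea concrete, here is a minimal Python sketch of an AdaBoost M1 loop in which the weight update is driven by the combined vote of all hypotheses built so far, rather than by the last hypothesis alone. This is our reading of "exploiting the hypotheses already built", not the paper's exact formulation; the decision stump stands in for C4.5, and labels are assumed to be in {-1, +1}:

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def adaboost_hyb(X, y, T=50):
    n = len(y)
    w = np.full(n, 1.0 / n)                 # uniform initial weights
    hyps, alphas = [], []
    for t in range(T):
        h = DecisionTreeClassifier(max_depth=1).fit(X, y, sample_weight=w)
        eps = np.sum(w[h.predict(X) != y])  # weighted error of the new hypothesis
        if eps == 0 or eps >= 0.5:          # degenerate hypothesis: stop
            break
        alpha = 0.5 * np.log((1.0 - eps) / eps)
        hyps.append(h)
        alphas.append(alpha)
        # "Hybrid" step: re-weight from the mistakes of the combined vote
        # of every hypothesis built so far, not only the last one.
        F = sum(a * g.predict(X) for a, g in zip(alphas, hyps))
        w *= np.exp(np.where(np.sign(F) == y, -alpha, alpha))
        w /= w.sum()                        # normalize
    return hyps, alphas
```

Standard AdaBoost M1 would re-weight using only the last hypothesis's predictions; routing the update through the combined vote F is what reuse of the preceding iterations amounts to in this sketch.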
We also carried out an experimental comparison of the proposed method with BrownBoost, a more recent method known to improve on AdaBoost M1 with noisy data. The results show that our method improves on BrownBoost's recall rates and speed of convergence over all 15 data sets. They also show that BrownBoost gives better error rates on some data sets, while our method gives better error rates on others. The same conclusion holds with noisy data.
To confirm the experimental results obtained, further experiments are planned. We are working on additional databases that other researchers have used in their studies of boosting algorithms. We plan to choose weak learning methods other than C4.5, in order to see whether the results obtained are specific to C4.5 or hold more generally. We also plan to compare the proposed algorithm with variants of boosting other than AdaBoost M1, in particular those that improve the speed of convergence, such as IAdaBoost and RegionBoost. If these comparisons are encouraging, a theoretical study of convergence will be carried out to confirm the experimental results.
Another objective that seems important to us is to make this approach more robust to noisy data. Indeed, the emergence and evolution of modern databases force researchers to study and improve boosting's tolerance to noise: these databases contain a great deal of noise, owing to new data-acquisition technologies such as the Web. In parallel, studies such as [5], [16] and [18] show that AdaBoost tends to overfit the data, and especially the noise, so a number of recent works have tried to limit this risk of overfitting. These improvements rest primarily on the observation that AdaBoost tends to increase the weights of noisy examples exponentially. Two solutions have thus been proposed to reduce the sensitivity to noise. One consists in detecting such examples and removing them on the basis of selection heuristics.
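As an illustration of this first solution (a hedged sketch: the threshold rule, the factor parameter, and the function name are ours, not taken from the cited works), suspected noisy examples can be flagged when their boosting weight grows far beyond the uniform level 1/n and removed before retraining:

```python
import numpy as np

def remove_suspected_noise(X, y, w, factor=10.0):
    # Under AdaBoost's exponential re-weighting, mislabeled examples
    # tend to accumulate extreme weights; drop any example whose weight
    # exceeds `factor` times the uniform weight 1/n.
    keep = w < factor / len(y)
    return X[keep], y[keep]
```

A more conservative variant would cap the weights instead of removing the examples, at the cost of keeping some label noise in the training set.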