to improve the performance of a single classifier through voting techniques. These aggregation methods achieve a good bias-variance trade-off, for the three fundamental reasons explained in [6]. They are divided into two categories. The first category contains methods that merge preset classifiers, such as simple voting [2], weighted voting [2], and weighted majority voting [12]. The second category contains methods that merge classifiers built from the data during training, such as adaptive strategies (Boosting), with AdaBoost [21] as the basic algorithm, or random strategies (Bagging) [3].
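As a minimal illustration of the first category, the sketch below shows simple and weighted majority voting over preset binary classifiers. The callable-classifier interface and the example weights are assumptions made for illustration, not taken from the references above.

```python
import numpy as np

def simple_vote(classifiers, x):
    """Simple (unweighted) majority vote over preset binary classifiers.

    Each classifier maps an example x to a label in {-1, +1}.
    Returns 0 in case of a tie.
    """
    votes = np.array([clf(x) for clf in classifiers])
    return np.sign(votes.sum())

def weighted_vote(classifiers, weights, x):
    """Weighted majority vote: each classifier's vote is scaled by its weight."""
    votes = np.array([w * clf(x) for clf, w in zip(classifiers, weights)])
    return np.sign(votes.sum())

# Illustrative usage with three toy threshold classifiers on a scalar input.
clfs = [lambda x: 1 if x > 0 else -1,
        lambda x: 1 if x > 2 else -1,
        lambda x: 1 if x > -1 else -1]
print(simple_vote(clfs, 1.0))           # majority says +1
print(weighted_vote(clfs, [0.2, 0.7, 0.1], 1.0))
```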
We are interested in Boosting because the comparative study [7] shows that, with little noise, AdaBoost seems to resist overfitting. Indeed, AdaBoost directly optimizes the weighted votes. This observation is supported not only by the fact that the empirical error on the training set decreases exponentially with the iterations, but also by the fact that the generalization error keeps decreasing even after the empirical error has reached its minimum. However, the method is criticized for overfitting and for its speed of convergence, especially in the presence of noise. In the last decade, many studies have focused on the weaknesses of AdaBoost and proposed improvements. The main improvements concern the modification of the weights of the examples [19], [18], [1], [20], [14], [8], the modification of the margin [9], [20], [17], the modification of the classifiers' weights [15], the choice of the weak learner [5], [24], and the speed of convergence [22], [13], [18].
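For reference, here is a minimal sketch of the standard AdaBoost loop that these works modify, showing the example-weight update and the classifiers' weights (the alphas) used in the final weighted vote. The choice of depth-1 decision trees from scikit-learn as weak learners is an illustrative assumption, not dictated by the text above.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier  # decision stumps as weak learners

def adaboost(X, y, T=50):
    """Basic AdaBoost: returns the weak hypotheses and their weights (alphas).

    y must contain labels in {-1, +1}.
    """
    y = np.asarray(y)
    n = len(y)
    D = np.full(n, 1.0 / n)                 # example weights, updated each round
    hypotheses, alphas = [], []
    for _ in range(T):
        h = DecisionTreeClassifier(max_depth=1).fit(X, y, sample_weight=D)
        pred = h.predict(X)
        eps = np.sum(D[pred != y])          # weighted training error
        if eps >= 0.5:                      # weak learner no better than chance
            break
        alpha = 0.5 * np.log((1 - eps) / max(eps, 1e-12))
        D *= np.exp(-alpha * y * pred)      # increase weight of misclassified examples
        D /= D.sum()                        # renormalize to a distribution
        hypotheses.append(h)
        alphas.append(alpha)
    return hypotheses, alphas

def predict(hypotheses, alphas, X):
    """Final weighted-vote classifier: sign of sum_t alpha_t * h_t(x)."""
    agg = sum(a * h.predict(X) for h, a in zip(hypotheses, alphas))
    return np.sign(agg)
```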
In this paper, we propose a new improvement of the basic Boosting algorithm AdaBoost. The approach exploits the hypotheses generated in the former iterations of AdaBoost to act both on the modification of the example weights and on the modification of the classifiers' weights. By exploiting these former hypotheses, we expect to avoid re-generating the same classifier across different iterations of AdaBoost and, consequently, to improve the speed of convergence.
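The precise mechanism is described in the third section. Purely as an illustration of the general idea of comparing a new weak hypothesis with those kept from earlier rounds, one could imagine a test such as the following; the function name, the agreement criterion, and the tolerance are hypothetical choices and not the method proposed in this paper.

```python
import numpy as np

def is_duplicate(h_new, past_hypotheses, X, tol=0.0):
    """Illustrative check (an assumption, not this paper's method):
    treat a new hypothesis as a duplicate if it agrees with some former
    hypothesis on (almost) every training example."""
    pred_new = h_new.predict(X)
    for h_old in past_hypotheses:
        disagreement = np.mean(pred_new != h_old.predict(X))
        if disagreement <= tol:
            return True
    return False
```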
The remainder of the paper is organized as follows. In the next section, we review the studies whose purpose is to improve Boosting with respect to its weaknesses. In the third section, we describe our improvement of Boosting, which exploits former hypotheses. In the fourth section, we present an experimental study of the proposed improvement, comparing its generalization error, recall, and speed of convergence with AdaBoost on several real datasets. We also study the behavior of the proposed improvement on noisy data, and present comparative experiments with BrownBoost (a method known to improve AdaBoost on noisy data). Lastly, we give our conclusions and perspectives.
3.2 State of the Art
Because of weaknesses observed in the basic boosting algorithm AdaBoost, such as overfitting and the speed of convergence, several researchers have tried to improve it.
We therefore review the main methods whose purpose is to improve boosting with respect to these weaknesses. With this intention, the researchers try