explanation is that even if all the training examples are already correctly classified, Boosting tends to keep maximizing the margins [20].
Following this observation, several studies have tried to modify the margin, either by maximizing it or by minimizing it, with the objective of improving Boosting's resistance to overfitting.
Several approaches followed, such as AdaBoostReg [17], which tries either to identify and remove badly labeled examples, or to apply the maximum-margin constraint to examples assumed to be badly labeled by using a soft margin. In the algorithm proposed by [9], the authors use a weighting scheme that relies on a margin function growing less quickly than the exponential function.
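To make this idea concrete, the sketch below (illustrative Python, not the actual algorithm of [9] nor AdaBoostReg; the function names and sample margins are invented) contrasts the exponential weighting implicit in AdaBoost with a logistic-style weighting that grows less quickly as the margin becomes negative, so that badly labeled examples cannot receive arbitrarily large weight.

    import numpy as np

    def exponential_weight(margin):
        # AdaBoost-style weighting: grows exponentially as the margin becomes
        # more negative, so badly classified (or mislabeled) examples quickly
        # dominate the weight distribution.
        return np.exp(-margin)

    def logistic_weight(margin):
        # A slower-growing alternative (logistic-type loss): bounded above by 1,
        # so an example with a very negative margin keeps a moderate weight.
        return 1.0 / (1.0 + np.exp(margin))

    margins = np.array([-3.0, -1.0, 0.0, 1.0, 3.0])
    print(exponential_weight(margins))   # approx. [20.09  2.72  1.    0.37  0.05]
    print(logistic_weight(margins))      # approx. [ 0.95  0.73  0.50  0.27  0.05]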
3.2.3 Modification of the Classifiers' Weight
During the performance evaluation of Boosting, researchers questioned the significance of the weights α(t) that AdaBoost associates with the hypotheses it produces.
They noted, in experiments on very simple data, that the generalization error kept decreasing even though the weak learner had already produced all the possible hypotheses. In other words, when a hypothesis appears several times, it finally votes with a weight equal to the sum of all its α(t), a value that could perhaps be set directly. Several researchers therefore hoped to obtain these values by a nonadaptive process, such as LocBoost [15], an alternative to the construction of global mixtures of experts which allows the coefficients α(t) to depend on the data.
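As a reminder of what these weights are, the short sketch below (illustrative Python with invented hypothesis labels and errors) computes the standard AdaBoost coefficient α(t) = ½ ln((1 − ε(t)) / ε(t)) and accumulates the final vote of a hypothesis that the weak learner returns at several rounds.

    import math
    from collections import defaultdict

    def alpha(error):
        # Standard AdaBoost weight for a hypothesis with weighted error `error`.
        return 0.5 * math.log((1.0 - error) / error)

    # Invented run in which the weak learner returns hypothesis "h1" three times.
    rounds = [("h1", 0.30), ("h2", 0.40), ("h1", 0.35), ("h1", 0.45)]

    total_vote = defaultdict(float)
    for name, err in rounds:
        total_vote[name] += alpha(err)

    # A repeated hypothesis finally votes with the sum of all its alpha(t).
    print(dict(total_vote))   # approx. {'h1': 0.83, 'h2': 0.20}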
3.2.4 Choice of Weak Learner
A question that several researchers have raised about Boosting concerns the weak learner: how should this base classifier be chosen?
Much research has therefore focused on the choice of the base classifier of Boosting, such as GloBoost [24], which uses a weak learner that produces only correct hypotheses. RankBoost [5] is another approach, based on a weak learner that accepts ranking functions as input attributes.
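For readers less familiar with the term, the base classifier most commonly handed to a boosting procedure is a decision stump; the sketch below (a minimal illustration in Python, not the specific weak learner of GloBoost or RankBoost) shows what such a weak learner looks like when trained on a weighted sample.

    import numpy as np

    def train_stump(X, y, weights):
        # Minimal decision-stump weak learner: pick the (feature, threshold, sign)
        # that minimizes the weighted error on {-1, +1} labels.
        best = None
        for j in range(X.shape[1]):
            for thr in np.unique(X[:, j]):
                for sign in (1, -1):
                    pred = np.where(X[:, j] <= thr, sign, -sign)
                    err = np.sum(weights[pred != y])
                    if best is None or err < best[0]:
                        best = (err, j, thr, sign)
        err, j, thr, sign = best
        return (lambda Z: np.where(Z[:, j] <= thr, sign, -sign)), err

    # Tiny invented example: four one-dimensional points with uniform weights.
    X = np.array([[0.1], [0.4], [0.6], [0.9]])
    y = np.array([1, 1, -1, -1])
    h, err = train_stump(X, y, np.full(4, 0.25))
    print(h(X), err)   # [ 1  1 -1 -1] 0.0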
3.2.5 The Speed of Convergence
In addition to the overfitting problem that Boosting encounters on the modern databases mentioned above, there is another problem: the speed of convergence of Boosting, and of AdaBoost in particular.
Indeed, in the presence of noisy data, the optimal error of the underlying training algorithm is reached only after a long time. In other words, AdaBoost "loses" iterations, and thus time, reweighting examples that in theory deserve no attention, since they are noise.
Research has therefore been carried out to detect these examples and improve the convergence of Boosting, such as iBoost [22], which aims at specializing the weak hypotheses on the examples assumed to be correctly classified.
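To see why these iterations are wasted, the illustrative sketch below (plain Python applying the standard AdaBoost reweighting rule with invented values; it is not iBoost) tracks the weight of a single mislabeled example that every hypothesis gets wrong: after only a few rounds it absorbs most of the weight distribution.

    import numpy as np

    n, alpha = 100, 0.5              # invented: 100 examples, constant alpha(t)
    weights = np.full(n, 1.0 / n)    # start from the uniform distribution
    noisy = 0                        # index of the single mislabeled example

    for t in range(10):
        agreement = np.ones(n)       # all other examples are correctly classified
        agreement[noisy] = -1.0      # the noisy example is always misclassified
        weights *= np.exp(-alpha * agreement)   # AdaBoost reweighting rule
        weights /= weights.sum()                # renormalize to a distribution
        print(t + 1, round(float(weights[noisy]), 3))
        # the noisy example's share rises from 1% to roughly 60% by round 5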
 