The modification within the algorithm is made in two ways.
The first way concerns the update of the weights of the examples: at each iteration, this strategy relies on the opinion of the experts already built (the hypotheses of the former iterations) to update the weights of the examples.
In fact, we compare the real class not only with the class predicted by the hypothesis of the current iteration, but also with the weighted sum of the hypotheses from the first iteration up to the current one. If this combined vote disagrees with the real class, an exponential update, as in AdaBoost, is applied to the misclassified example. This modification therefore lets the algorithm focus only on the examples that are either misclassified or not yet classified. Results improving the speed of convergence are thus expected, as well as a reduction of the generalization error, because of the richness of the space of hypotheses at each iteration.
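As a concrete illustration, the following Python sketch shows one possible reading of this first modification; the function name, the {-1, +1} label convention, and the use of the current coefficient alpha_t in the exponential update are assumptions of the sketch, not details taken from the original algorithm.

```python
import numpy as np

def update_weights(w, alphas, hypotheses, alpha_t, h_t, X, y):
    """Sketch of the modified weight update: an example is treated as
    misclassified when the weighted vote of all hypotheses produced so far
    (former iterations plus the current one) disagrees with its real class,
    and only those examples receive the AdaBoost-style exponential increase.
    Labels y are assumed to be in {-1, +1}; names are illustrative."""
    combined = np.zeros(len(y))
    for a, h in zip(alphas + [alpha_t], hypotheses + [h_t]):
        combined += a * h(X)                      # weighted sum of hypotheses so far
    misclassified = np.sign(combined) != y        # combined vote vs. real class
    w = np.where(misclassified, w * np.exp(alpha_t), w)
    return w / w.sum()                            # renormalise to a distribution
```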
The second way concerns the computation of the error ε(t) of the hypothesis at iteration t: this other strategy focuses instead on the coefficient α(t) of the classifier (hypothesis) at each iteration.
In fact, this coefficient depends on the apparent error ε(t). At each iteration, the method takes into account the hypotheses preceding the current iteration in the calculation of ε(t). The apparent error at each iteration is therefore the weight of the examples voted as misclassified by the weighted hypotheses of the former iterations, compared with the real class.
Results improving the generalization error are expected, since the vote of each hypothesis (coefficient α(t)) is calculated from the other hypotheses.
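A minimal sketch of this second modification, under the same assumptions ({-1, +1} labels, illustrative names), could look as follows; the AdaBoost-style formula deriving α(t) from the apparent error is an assumption of the sketch.

```python
import numpy as np

def apparent_error_and_alpha(w, alphas, hypotheses, X, y):
    """Sketch of the modified apparent error epsilon(t) and coefficient alpha(t):
    epsilon(t) is the total weight of the examples voted as misclassified by the
    weighted hypotheses of the former iterations, compared with the real class.
    Labels y are assumed to be in {-1, +1}."""
    combined = np.zeros(len(y))
    for a, h in zip(alphas, hypotheses):
        combined += a * h(X)                      # vote of the former hypotheses
    epsilon_t = w[np.sign(combined) != y].sum()   # weight of badly voted examples
    # AdaBoost-style coefficient computed from this apparent error
    alpha_t = 0.5 * np.log((1.0 - epsilon_t) / max(epsilon_t, 1e-12))
    return epsilon_t, alpha_t
```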
3.4 Experiments
The objective of this part is to compare our new approach, and especially its contribution, with the original AdaBoost, and to extend this comparison by choosing an improved version of AdaBoost, BrownBoost [14].
Our choice of BrownBoost was based on its robustness against the problem of noisy data. In fact, BrownBoost is an adaptive algorithm which uses a weighting function that depends on the number of iterations k (the execution time), the current iteration i, the number of times r that the example has already been correctly predicted, and the probability of success 1/2 + γ,

\[
\alpha_r = \binom{k-i-1}{k/2 - r}\left(\frac{1}{2}+\gamma\right)^{k/2 - r}\left(\frac{1}{2}-\gamma\right)^{k/2 - i - 1 + r},
\]

instead of the exponential function.
So, by a good estimation of the parameter k, BrownBoost is capable of avoiding overfitting. The advantage of this approach is that the noisy data are detected at some point and their weights stop rising.
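For illustration, a direct transcription of this weighting function into Python might look as follows; the integer rounding of k/2 and the handling of out-of-range values of r are assumptions of this sketch, not details given in the text.

```python
from math import comb

def brownboost_weight(k, i, r, gamma):
    """Sketch of the weighting function quoted above, used instead of the
    exponential function:
      k     - total number of iterations (execution time)
      i     - current iteration
      r     - number of times the example has already been correctly predicted
      gamma - edge of the weak learner (probability of success 1/2 + gamma)
    The rounding of k/2 (integer division here) is an assumption."""
    half = k // 2
    successes_needed = half - r        # correct predictions still needed
    rounds_left = k - i - 1            # remaining iterations
    if successes_needed < 0 or successes_needed > rounds_left:
        return 0.0                     # the final majority vote is already decided
    return (comb(rounds_left, successes_needed)
            * (0.5 + gamma) ** successes_needed
            * (0.5 - gamma) ** (rounds_left - successes_needed))
```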
The comparison criteria chosen in this article are the error rate, the recall, the p-value, the average gain compared to AdaBoost, the speed of convergence, and the sensitivity to noise.