on the mistakes of h1. Then a base learner h2 is trained from the derived distribution. Again, suppose that h2 is also a weak learner, and the combination of h1 and h2 is not good enough. Boosting will then derive a new distribution to focus on the mistakes of the combined classifier of the former base learners.

It is obvious that the base learners of boosting are dependent, and they are generated in a sequential way. This is different from parallel ensemble methods such as Bagging: parallel ensemble methods exploit the independence of base learners, while sequential ensemble methods exploit the dependence of base learners.

There are many variants of boosting. The most influential one is AdaBoost [10], which has been called the best off-the-shelf learning algorithm in the world. AdaBoost can be regarded as an instantiation of the general boosting procedure. The algorithm is shown in the following.

Algorithm: The AdaBoost algorithm

Input:  Data set D = {(x_i, y_i)}, i = 1, ..., n
        Base learning algorithm L
        The number of iterations T

 1: D_1(x) = 1/n                              /* initialize the weight distribution */
 2: for t = 1 to T do
 3:     h_t = L(D, D_t)                       /* train h_t from D with distribution D_t */
 4:     ε_t = P_{x~D_t}(h_t(x) ≠ y(x))        /* evaluate the error of h_t */
 5:     if ε_t > 0.5 then
 6:         break
 7:     end if
 8:     α_t = (1/2) ln((1 - ε_t) / ε_t)       /* determine the weight of h_t */
 9:     D_{t+1}(x) = (D_t(x) / Z_t) × exp(-α_t)   if h_t(x) = y(x)
                     (D_t(x) / Z_t) × exp(α_t)    if h_t(x) ≠ y(x)
                   = D_t(x) exp(-α_t y(x) h_t(x)) / Z_t
                                              /* update the distribution; Z_t is a
                                                 normalization factor */
10: end for

Output: H(x) = sign( Σ_{t=1}^{T} α_t h_t(x) )
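The AdaBoost procedure above can be sketched in Python. This is a minimal illustration under stated assumptions, not the book's implementation: the 1-D decision-stump base learner, the function names, and the use of labels in {-1, +1} (so the exponential weight update applies directly) are all choices made here for concreteness.

```python
import math

def train_stump(xs, ys, weights):
    """Hypothetical base learner L: a 1-D decision stump chosen to
    minimize the weighted training error under the current distribution."""
    best = None
    for thresh in xs:
        for polarity in (1, -1):
            err = sum(w for x, y, w in zip(xs, ys, weights)
                      if polarity * (1 if x > thresh else -1) != y)
            if best is None or err < best[0]:
                best = (err, thresh, polarity)
    _, thresh, polarity = best
    return (thresh, polarity)

def stump_predict(stump, x):
    thresh, polarity = stump
    return polarity * (1 if x > thresh else -1)

def adaboost(xs, ys, T):
    n = len(xs)
    weights = [1.0 / n] * n              # D_1(x) = 1/n
    ensemble = []                        # list of (alpha_t, h_t)
    for _ in range(T):
        stump = train_stump(xs, ys, weights)
        # epsilon_t: weighted error of h_t under D_t
        eps = sum(w for x, y, w in zip(xs, ys, weights)
                  if stump_predict(stump, x) != y)
        if eps > 0.5:
            break
        eps = max(eps, 1e-10)            # avoid division by zero for a perfect stump
        alpha = 0.5 * math.log((1 - eps) / eps)
        # distribution update: D_{t+1}(x) proportional to D_t(x) * exp(-alpha * y * h_t(x))
        weights = [w * math.exp(-alpha * y * stump_predict(stump, x))
                   for x, y, w in zip(xs, ys, weights)]
        z = sum(weights)                 # normalization factor Z_t
        weights = [w / z for w in weights]
        ensemble.append((alpha, stump))
    return ensemble

def predict(ensemble, x):
    """H(x) = sign(sum over t of alpha_t * h_t(x))."""
    score = sum(a * stump_predict(h, x) for a, h in ensemble)
    return 1 if score >= 0 else -1
```

Note how misclassified points receive larger weight after each round, so the next base learner is forced to focus on them, which is exactly the sequential dependence discussed above.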
4.2.3 Combination Methods
Averaging, voting, and stacking [11-13] are the most popular combination methods for ensembles. Averaging, commonly used for regression tasks, averages the numeric outputs of the base learners. Voting is the most popular and fundamental combination method for classification tasks, and majority voting is the most common voting method: each learner votes for one class label, and the ensemble outputs the class label that receives more than half of the votes; if no class label receives more than half of the votes, then no prediction is given. Unlike majority voting, plurality voting does not require that the output