Algorithm 1. Greedy-Boost algorithm

S = {(x_1, y_1), ..., (x_n, y_n)} is a set of labeled instances (connections)
for i = 1 to n do
    Initialize p_0(x_i) = 1/n
end for
for t = 1 to T do
    Choose S_t from S using bootstrap sampling.
    Construct model h_t for S_t using weak learner A.
    Calculate ε_t, the error rate of h_t.
    Calculate the score of the classifier α_t = 1/2 ln((1 − ε_t)/ε_t).
    for j = 1 to n do
        If argmax_{y ∈ Y} Σ_{i=1..t} α_i [h_i(x_j) = y] = y_j (the connection is well classified):
            p_{t+1}(x_j) ← (p_0(x_j)/Z_t) · e^{−α_t}
        If argmax_{y ∈ Y} Σ_{i=1..t} α_i [h_i(x_j) = y] ≠ y_j (the connection is badly classified):
            p_{t+1}(x_j) ← (p_0(x_j)/Z_t) · e^{+α_t}
        Use Z_t to normalize so that Σ_{j=1..n} p_{t+1}(x_j) = 1
    end for
end for
return H(x) = argmax_{y ∈ Y} Σ_{t=1..T} α_t [h_t(x) = y]
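A minimal sketch of Algorithm 1 in Python may help make the update rule concrete. The weak learner A is not specified above, so a depth-one decision tree (a stump) from scikit-learn stands in for it here; the function name greedy_boost, the default of T = 10 rounds, and the bootstrap sample size are illustrative assumptions, not part of the original description.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def greedy_boost(X, y, T=10, seed=0):
    """Sketch of Greedy-Boost: every round rescales the *initial*
    distribution p_0, not the distribution of the previous round."""
    rng = np.random.default_rng(seed)
    X, y = np.asarray(X), np.asarray(y)
    n = len(y)
    classes = np.unique(y)
    p0 = np.full(n, 1.0 / n)        # p_0(x_i) = 1/n
    p = p0.copy()                   # current distribution p_t
    models, alphas = [], []

    for t in range(T):
        # S_t: bootstrap sample drawn from S according to p_t
        idx = rng.choice(n, size=n, replace=True, p=p)
        h = DecisionTreeClassifier(max_depth=1).fit(X[idx], y[idx])

        # eps_t: weighted error of h_t; alpha_t = 1/2 ln((1 - eps_t)/eps_t)
        eps = np.average(h.predict(X) != y, weights=p)
        eps = np.clip(eps, 1e-10, 0.5 - 1e-10)  # clamp to the weak-learner regime assumed above
        alpha = 0.5 * np.log((1.0 - eps) / eps)
        models.append(h)
        alphas.append(alpha)

        # current combined classifier H_t(x_j) = argmax_y sum_{i<=t} alpha_i [h_i(x_j) = y]
        votes = np.zeros((n, len(classes)))
        for a, m in zip(alphas, models):
            votes[np.arange(n), np.searchsorted(classes, m.predict(X))] += a
        well_classified = classes[votes.argmax(axis=1)] == y

        # Greedy-Boost update: rescale p_0 (not p_t), then normalize by Z_t
        p = p0 * np.where(well_classified, np.exp(-alpha), np.exp(alpha))
        p /= p.sum()                # division by Z_t

    def H(X_new):
        """Combined classifier H(x) = argmax_y sum_t alpha_t [h_t(x) = y]."""
        X_new = np.asarray(X_new)
        votes = np.zeros((len(X_new), len(classes)))
        for a, m in zip(alphas, models):
            votes[np.arange(len(X_new)), np.searchsorted(classes, m.predict(X_new))] += a
        return classes[votes.argmax(axis=1)]

    return H
```

The only line that departs from standard AdaBoost is the update p = p0 * ..., which rescales the initial distribution p_0 rather than the previous p_t, exactly as in the pseudocode above.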
At each iteration, the new distribution is computed from the initial distribution instead of the previous one (all examples start with the same weight). In other words, we emphasize the examples badly classified by the current combined model. This produces a new sample, which is used to generate a new model. The interest of such a process is that it necessarily converges towards a stable solution, under the hypothesis that at each iteration we are able to find a weak learner, i.e., one with an error ε_t < 1/2 on the weighted sample.
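Written side by side, the only difference from the standard AdaBoost rule (recalled here purely for contrast) is which distribution gets rescaled:

p_{t+1}(x_j) = (p_t(x_j)/Z_t) · e^{∓α_t}   (AdaBoost: rescales the previous distribution; the sign is set by whether h_t classifies x_j correctly)
p_{t+1}(x_j) = (p_0(x_j)/Z_t) · e^{∓α_t}   (Greedy-Boost, Algorithm 1: rescales the initial distribution; the sign is set by whether the current combined classifier classifies x_j correctly)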
Theorem. Under the hypothesis that at each iteration the learner constructed by Greedy-Boost is at least a weak learner, i.e., ∀t, ε_t < 1/2, Greedy-Boost necessarily converges towards a stable solution when T → ∞.
Proof: The proof is based on a theorem of [12], which ensures that at iteration t+1 the error on the learning sample of the combined classifier H_{t+1} of AdaBoost is lower than that of H_t, the combined classifier of iteration t:

err(H_{t+1}) < err(H_t) × 1/(4(1 − ε_t)²)          (1)

where err(·) is the empirical error. With ε_t < 1/2, we have 1/(4(1 − ε_t)²) < 1, so
err(H_{t+1}) < err(H_t). We consider that the solution of Greedy-Boost is equivalent to the solution of AdaBoost in the first two iterations: H^{GRE}_1(x) = H^{Ada}_1(x). Afterwards, at each iteration, we change the distribution p_t in the same way as AdaBoost, based on the linear combination of the models previously generated. This process is equivalent to applying AdaBoost to the current classifier H^{GRE}_t. So, based on (1), we have err(H^{GRE}_{t+1}) < err(H^{GRE}_t) × 1/(4(1 − ε_t)²). Consequently, at iteration T, we have:
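err(H^{GRE}_T) < err(H^{GRE}_1) × ∏_{t=1..T−1} 1/(4(1 − ε_t)²),

obtained by applying (1) at each of the iterations 1, ..., T − 1. Since ε_t < 1/2 for every t, each factor of this product is strictly smaller than 1, so the empirical error of the combined classifier decreases at every iteration and Greedy-Boost converges towards a stable solution as T → ∞.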
 