Evolutionary Computation - Advanced Artificial Intelligence

(13.29)

The difference between the current intensity and the equilibrium intensity is reduced by the fraction $b$ every time the intensity is changed. For example, assume that $S^* = 500$, $b = 0.1$ and $S_i(t) = 100$; then $S_i(t+1) = 100 - 10 + 50 = 140$. Note that the error $S^* - S_i$ is reduced from 400 to 360; that is, this error is reduced by 10%.
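The arithmetic above can be checked with a short Python sketch. It assumes the update has the form $S_i(t+1) = S_i(t) - bS_i(t) + b\,p$ with a constant reward $p = S^*$, which is inferred from the worked numbers (50 = 0.1 × 500) rather than quoted from the text; the function names `psp_update` and `run_constant_reward` are hypothetical.

```python
def psp_update(strength, reward, b):
    """One update of the assumed form S(t+1) = S(t) - b*S(t) + b*reward."""
    return strength - b * strength + b * reward


def run_constant_reward(s0, reward, b, plots):
    """Apply the update once per plot; return the intensity after each plot."""
    history = [s0]
    for _ in range(plots):
        history.append(psp_update(history[-1], reward, b))
    return history


# Values from the worked example: S* = 500, b = 0.1, S_i(t) = 100.
history = run_constant_reward(s0=100.0, reward=500.0, b=0.1, plots=50)
errors = [500.0 - s for s in history]

print(history[1])            # intensity after the first update
print(errors[0], errors[1])  # error before and after the first update
print(history[-1])           # intensity after 50 plots
```

Under these assumptions the error is multiplied by $1 - b = 0.9$ at each update, so after 50 plots the intensity lies within a few units of 500.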

Under constant reward, each rule's intensity quickly converges to its equilibrium intensity, and the reward can be evaluated once the plot is over. A possible limitation of PSP is that credit must be assigned over intervals (plots) delimited by the exterior reward, so it is very important to choose these plots well.

Suppose that rule $R_i$ is ignited at step $\tau$ while rule $R_j$ is ignited at step $\tau+1$. Then BBA uses the following formula to modify the intensity $S_i$ of rule $R_i$:

$$S_i(\tau+1) = S_i(\tau) - bS_i(\tau) + bS_j(\tau) \tag{13.30}$$

Except that the plot index $t$ is replaced with the step index $\tau$ and the exterior reward $p(t)$ is replaced with the intensity $S_j$ of rule $R_j$, this formula is the same as (13.26). The first change means that a rule's intensity may be modified more than once within a given plot. The second change leads to the basic difference between PSP and BBA. Consider two rules $R_i$ and $R_j$, where rule $R_j$ is ignited after rule $R_i$. Assume that $R_i$ and $R_j$ are each ignited no more than once in a plot; then we have:

$$S_i(t) = (1-b)^t S_i(0) + \sum_{k=1}^{t} b(1-b)^{t-k} S_j(k-1) \tag{13.31}$$

where $t$ ranges over the plots in which both rules are active. In other words, the intensity of $R_i$ follows that of $R_j$. If $S_j$ converges to a constant $S_j^*$, then $S_i$ also converges:

$$S_i^* = \lim_{t\to\infty} S_i(t) = \lim_{t\to\infty}\Big[(1-b)^t S_i(0) + \sum_{k=1}^{t} b(1-b)^{t-k} S_j(k-1)\Big] = S_j^* \tag{13.32}$$
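As a numerical illustration of (13.30) to (13.32), the following Python sketch iterates the BBA update once per plot, compares the result with the closed form, and checks that $S_i$ approaches the limit of $S_j$. It is a sketch under stated assumptions: the update is taken to be $S_i \leftarrow S_i - bS_i + bS_j$, the trajectory of $S_j$ is invented for illustration, and the names `bba_update` and `closed_form` are hypothetical.

```python
def bba_update(s_i, s_j, b):
    """Assumed BBA update: S_i <- S_i - b*S_i + b*S_j (R_j fires after R_i)."""
    return s_i - b * s_i + b * s_j


def closed_form(s_i0, s_j_history, b, t):
    """(1-b)^t * S_i(0) + sum over k=1..t of b * (1-b)^(t-k) * S_j(k-1)."""
    total = (1 - b) ** t * s_i0
    for k in range(1, t + 1):
        total += b * (1 - b) ** (t - k) * s_j_history[k - 1]
    return total


b = 0.1
plots = 200
s_j_star = 500.0  # hypothetical limit of S_j

# Invented trajectory for S_j that converges to s_j_star (illustration only).
s_j_history = [s_j_star * (1 - 0.8 ** k) for k in range(plots)]

s_i = 100.0
s_i_history = [s_i]
for t in range(1, plots + 1):
    # In plot t, R_i is paid from the intensity S_j held after plot t-1.
    s_i = bba_update(s_i, s_j_history[t - 1], b)
    s_i_history.append(s_i)

print(s_i_history[10], closed_form(100.0, s_j_history, b, 10))
print(s_i_history[-1], s_j_star)
```

The iterated value and the closed form agree up to floating-point rounding, and the final value of $S_i$ sits close to the assumed limit of $S_j$, which is the behavior the limit argument describes.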

Similarly, formula (13.29) shows that $S_j$ can converge to $S_j^*$ (the internal payoff for $R_i$). This kind of analysis can be extended to any rule chain. For example, when $\langle R_1 R_2 \ldots R_n \rangle$ are ignited in turn, only rule $R_n$ can receive
