values (i.e., $T_{tax}$, $T_{maintain}$, $T_{cele}$, and $T_{human}$). These four default values are
the desired outcomes. Note that a player does not know these four values. We
can compute the achievement level for each outcome for the player as follows:
$$a_{level} = (d_j - o_j), \tag{5}$$
where $d_j$ is the desired value of the outcome and $o_j$ is the actual outcome.
As soon as we get the achievement level, we apply back propagation to evolve
the neural network.
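As a minimal sketch of Eq. (5), the per-outcome achievement level is the desired value minus the actual outcome. The function name `achievement_levels`, the dictionary keys, and all numbers below are hypothetical and chosen only for illustration; the source does not give concrete values for the four targets.

```python
def achievement_levels(desired, actual):
    """Eq. (5): achievement level d_j - o_j for each outcome j."""
    return {name: desired[name] - actual[name] for name in desired}

# Illustrative targets (standing in for T_tax, T_maintain, T_cele, T_human)
# and illustrative actual outcomes; these numbers are assumptions.
desired = {"tax": 0.75, "maintain": 0.5, "cele": 0.25, "human": 1.0}
actual = {"tax": 0.5, "maintain": 0.5, "cele": 0.75, "human": 0.25}

levels = achievement_levels(desired, actual)
print(levels)
```

A level of zero means the outcome matches its target; a positive level means the actual outcome fell below the desired value, and a negative level means it exceeded it.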
Let $K(L)$ be the number of neurons in the $L$-th layer, i.e., the last layer. The
total error $J$ is defined as:
$$J = \frac{1}{2} \sum_{j=1}^{K(L)} e_j^2, \tag{6}$$
where $e_j$ is:
$$e_j = d_j - o_j. \tag{7}$$
$o_j$ is the output of the $j$-th neuron. To update the hidden layer, we have
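$$\frac{\partial J}{\partial o_j^{(r)}} = \sum_{k=1}^{K(r+1)} \frac{\partial J}{\partial o_k^{(r+1)}} \, f'\!\left(s_k^{(r+1)}\right) w_{kj}^{(r+1)}, \tag{8}$$
$$\Delta w_{ji}^{(r)} = \sum_{k=1}^{K(r+1)} \frac{\partial J}{\partial o_k^{(r+1)}} \, f'\!\left(s_k^{(r+1)}\right) w_{kj}^{(r+1)} \, f'\!\left(s_j^{(r)}\right) o_i^{(r-1)} \tag{9}$$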
where $\Delta w_{ji}^{(r)}$ is the change of the weight from the $i$-th neuron to the $j$-th neuron of
the $r$-th layer, and $o_k^{(r+1)}$ is the output of the $k$-th neuron in the $(r+1)$-th layer.
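The step from Eq. (8) to Eq. (9) can be read as one application of the chain rule. The sketch below assumes the usual conventions, which this excerpt does not state explicitly: $s_j^{(r)} = \sum_i w_{ji}^{(r)} o_i^{(r-1)}$ is the weighted input of neuron $j$, $o_j^{(r)} = f(s_j^{(r)})$ with activation function $f$, and the weight change is taken to be the gradient $\partial J / \partial w_{ji}^{(r)}$.

```latex
\Delta w_{ji}^{(r)}
  = \frac{\partial J}{\partial w_{ji}^{(r)}}
  = \frac{\partial J}{\partial o_j^{(r)}}
    \cdot \frac{\partial o_j^{(r)}}{\partial s_j^{(r)}}
    \cdot \frac{\partial s_j^{(r)}}{\partial w_{ji}^{(r)}}
  = \frac{\partial J}{\partial o_j^{(r)}} \, f'\!\left(s_j^{(r)}\right) o_i^{(r-1)}
```

Substituting Eq. (8) for the first factor gives Eq. (9).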
For the last layer, i.e., the $L$-th layer, we have
$$\frac{\partial J}{\partial o_k^{(L)}} = -(d_k - o_k). \tag{10}$$
The weight changes are computed as follows:
$$\Delta w_{kj}^{(L)} = \frac{\partial J}{\partial o_k^{(L)}} \, f'\!\left(s_k^{(L)}\right) o_j^{(L-1)}. \tag{11}$$
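The backward pass in Eqs. (8) to (11) can be sketched in plain Python as follows. This is an illustration under stated assumptions rather than the authors' implementation: the sigmoid activation, the absence of bias terms, the 2-2-2 network shape, the weight values, and the learning rate are all hypothetical, and the function names (`forward`, `backward`, `step`) are made up for this example. The final lines apply one plain gradient-descent step, subtracting the learning rate times the weight change.

```python
import math

def f(s):
    """Sigmoid activation (an assumed choice of f)."""
    return 1.0 / (1.0 + math.exp(-s))

def f_prime(s):
    """Derivative of the sigmoid at net input s."""
    y = f(s)
    return y * (1.0 - y)

def forward(weights, x):
    """Return net inputs s[r] and outputs o[r] per layer; o[0] is the input."""
    s, o = [None], [list(x)]
    for W in weights:  # W[j][i]: weight from neuron i (layer r-1) to neuron j (layer r)
        s_r = [sum(w * v for w, v in zip(row, o[-1])) for row in W]
        s.append(s_r)
        o.append([f(v) for v in s_r])
    return s, o

def total_error(weights, x, desired):
    """Eq. (6) with e_j = d_j - o_j from Eq. (7)."""
    _, o = forward(weights, x)
    return 0.5 * sum((d - y) ** 2 for d, y in zip(desired, o[-1]))

def backward(weights, x, desired):
    """Return the weight changes for every layer, same shape as weights."""
    s, o = forward(weights, x)
    L = len(weights)                                   # index of the last layer
    dJ_do = [-(d - y) for d, y in zip(desired, o[L])]  # Eq. (10)
    grads = [None] * L
    for r in range(L, 0, -1):
        W = weights[r - 1]
        # dJ/do_j^(r) * f'(s_j^(r)) for each neuron j of layer r
        delta = [g * f_prime(v) for g, v in zip(dJ_do, s[r])]
        # Eq. (11) for r == L, Eq. (9) for hidden layers
        grads[r - 1] = [[d_j * o_i for o_i in o[r - 1]] for d_j in delta]
        # Eq. (8): propagate dJ/do back to layer r-1
        dJ_do = [sum(delta[k] * W[k][j] for k in range(len(W)))
                 for j in range(len(o[r - 1]))]
    return grads

def step(weights, grads, eta):
    """One gradient-descent step: w <- w - eta * delta_w."""
    return [[[w - eta * g for w, g in zip(w_row, g_row)]
             for w_row, g_row in zip(W, G)]
            for W, G in zip(weights, grads)]

# Hypothetical 2-2-2 network, input, and targets (illustrative values only).
weights = [
    [[0.1, -0.2], [0.4, 0.3]],    # layer 1 (hidden)
    [[0.2, -0.5], [-0.3, 0.6]],   # layer 2 (output)
]
x = [1.0, 0.5]
desired = [1.0, 0.0]

grads = backward(weights, x, desired)
J_before = total_error(weights, x, desired)
J_after = total_error(step(weights, grads, eta=0.5), x, desired)
print(J_before, J_after)
```

Keeping the product of the output-side derivative and the activation derivative in a single per-neuron list (`delta`) lets the same quantity serve both the weight changes of the current layer and the sum in Eq. (8) for the layer below.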
We can use the following formula to update all the weights:
$$w(t+1) = w(t) - \eta \, \Delta w,$$