the listener coincides with the speaker's meaning then a success has happened.
A failure happens when the two meanings differ. After a success, the
corresponding coefficients of the association matrices in both robots are
increased, and the competing association coefficients (i.e. a row for the
speaker and a column for the listener) are updated in the opposite direction.
This additional update is known as lateral inhibition, and it is a key element
of the convergence process. Similarly, the coefficients involved in a failure
are decreased in both robots.
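The success and failure updates described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes that rows index meanings and columns index symbols, that coefficients are kept in [0, 1], and that updates use a fixed additive step `delta`. The function names, `delta`, and the choice of which listener coefficient is penalized after a failure are all hypothetical.

```python
def clamp(value, low=0.0, high=1.0):
    """Keep a coefficient inside [low, high] (assumed range)."""
    return max(low, min(high, value))


def update_on_success(speaker, listener, meaning, symbol, delta=0.1):
    """Reinforce (meaning, symbol) in both matrices and inhibit competitors.

    Matrices are lists of rows: rows are meanings, columns are symbols
    (an assumption made for this sketch).
    """
    # Speaker: raise the used coefficient, lower the rest of the meaning row.
    for j in range(len(speaker[meaning])):
        change = delta if j == symbol else -delta
        speaker[meaning][j] = clamp(speaker[meaning][j] + change)
    # Listener: raise the used coefficient, lower the rest of the symbol column.
    for i in range(len(listener)):
        change = delta if i == meaning else -delta
        listener[i][symbol] = clamp(listener[i][symbol] + change)


def update_on_failure(speaker, listener, meaning, symbol, decoded, delta=0.1):
    """Weaken the coefficients that produced the failed exchange.

    Assumption: the speaker's (meaning, symbol) entry and the listener's
    (decoded, symbol) entry are the ones "involved in a failure".
    """
    speaker[meaning][symbol] = clamp(speaker[meaning][symbol] - delta)
    listener[decoded][symbol] = clamp(listener[decoded][symbol] - delta)
```

With 2x2 matrices initialised to 0.5, a success on meaning 0 and symbol 1 raises that entry in both robots, while the speaker's other symbol for meaning 0 and the listener's other meaning for symbol 1 are pushed down.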
for k = 1, 2, ..., max rounds do
    Execute all the possible communication acts
    Compute the communicative efficiency of the robot team EC(k)
    if EC(k) = Max in three consecutive rounds then
        Break
    end if
end for
Fig. 1. Pseudo code of the reinforcement learning-based lexical coordination procedure
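The outer loop of Fig. 1 might be written in Python as below. This is only a sketch: `run_round` is a hypothetical callable that executes all communication acts of round k and returns EC(k), and `ec_max`, `patience` and `tol` are illustrative parameter names, with `patience=3` mirroring the "three consecutive rounds" criterion.

```python
def coordinate(run_round, max_rounds, ec_max=1.0, patience=3, tol=1e-9):
    """Run language-game rounds until EC(k) stays at its maximum.

    run_round(k) is assumed to execute every communication act of round k
    and return the team's communicative efficiency EC(k).
    Returns the list of EC values, one per executed round.
    """
    history = []
    streak = 0  # number of consecutive rounds with EC(k) == ec_max
    for k in range(1, max_rounds + 1):
        ec = run_round(k)
        history.append(ec)
        streak = streak + 1 if abs(ec - ec_max) <= tol else 0
        if streak == patience:
            break
    return history
```

Resetting `streak` whenever EC(k) drops below the maximum ensures that the loop stops only after consecutive maximal rounds, not after three maximal rounds in total.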
Randomly assign the sender/receiver roles
for k = 1, 2, ..., number of meanings do
    Send the meaning m_k according to the sender's association matrix
    Decode the received symbol s_k according to the receiver's association matrix
    Update both matrices depending on the communication result
end for
Fig. 2. Pseudo code of a communication act
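A single communication act from Fig. 2 could look like the following sketch. It assumes that "according to the association matrix" means a winner-take-all choice: the sender picks the symbol with the largest coefficient in the meaning's row, and the receiver picks the meaning with the largest coefficient in the symbol's column. That reading, the `update` callback and all names are assumptions made for illustration.

```python
import random


def argmax(values):
    """Index of the largest value (first one on ties)."""
    return max(range(len(values)), key=lambda idx: values[idx])


def communication_act(robot_a, robot_b, update, rng=random):
    """Play one communication act between two association matrices.

    Matrices are lists of rows (rows: meanings, columns: symbols).
    update(sender, receiver, meaning, symbol, decoded, success) is a
    hypothetical callback that adjusts both matrices.
    Returns the number of successfully transmitted meanings.
    """
    # Randomly assign the sender/receiver roles.
    if rng.random() < 0.5:
        sender, receiver = robot_a, robot_b
    else:
        sender, receiver = robot_b, robot_a

    successes = 0
    for meaning in range(len(sender)):
        symbol = argmax(sender[meaning])             # encode m_k
        column = [row[symbol] for row in receiver]
        decoded = argmax(column)                     # decode s_k
        success = decoded == meaning
        update(sender, receiver, meaning, symbol, decoded, success)
        successes += int(success)
    return successes
```

Passing the update rule as a callback keeps the act independent of the particular algorithm used to adjust the coefficients.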
The ultimate goal is that, after the execution of all the language game rounds,
the robot team converges to an optimal communication system in which all the
robots use the same optimal permutation matrix (optimal Saussurean solution).
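One way to test for this outcome is sketched below. It assumes that each real-valued association matrix is first reduced to a 0/1 matrix by keeping the largest coefficient in every row, and that convergence means all reduced matrices are identical and form a permutation matrix. The reduction step and the function names are assumptions, not part of the original procedure.

```python
def binarize(matrix):
    """Winner-take-all per row: 1 at the largest coefficient, 0 elsewhere."""
    result = []
    for row in matrix:
        best = max(range(len(row)), key=lambda j: row[j])
        result.append([1 if j == best else 0 for j in range(len(row))])
    return result


def is_permutation_matrix(matrix):
    """True if the matrix is square, 0/1, with one 1 per row and column."""
    n = len(matrix)
    if n == 0 or any(len(row) != n for row in matrix):
        return False
    entries_ok = all(x in (0, 1) for row in matrix for x in row)
    rows_ok = all(sum(row) == 1 for row in matrix)
    cols_ok = all(sum(col) == 1 for col in zip(*matrix))
    return entries_ok and rows_ok and cols_ok


def team_has_converged(team):
    """True if every robot's binarized matrix is the same permutation matrix."""
    reduced = [binarize(matrix) for matrix in team]
    first = reduced[0]
    return is_permutation_matrix(first) and all(m == first for m in reduced)
```

A permutation matrix pairs every meaning with exactly one symbol and every symbol with exactly one meaning, so no two meanings share a symbol.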
3.2 Algorithms for the Updating of the Association Matrices
We have applied two different algorithms for the updating of the coefficients of
the association matrices: (a) an Ant Colony Optimization-based algorithm, or
ACO-like for short, and (b) the incremental algorithm.
In the ACO-like algorithm the coefficients of the association matrix are
updated as follows:
\[
a_{ij}(k+1) = \rho\, a_{ij}(k) + (1-\rho)\,\beta(k), \qquad
\beta(k) = \begin{cases} 1 & \text{if reward/success} \\ 0 & \text{if punish/fail} \end{cases}, \qquad
0 \le \rho \le 1 \tag{4}
\]
in which ρ is a critical parameter that has to be carefully selected [1].
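The ACO-like rule can be illustrated with a short Python sketch. It assumes that Eq. (4) reads a_ij(k+1) = ρ a_ij(k) + (1 − ρ) β(k), with β(k) = 1 after a reward/success, β(k) = 0 after a punish/fail, and 0 ≤ ρ ≤ 1. The function name and argument names are hypothetical.

```python
def aco_like_update(a_ij, rho, success):
    """Return a_ij(k+1) = rho * a_ij(k) + (1 - rho) * beta(k).

    beta(k) is 1 after a success and 0 after a failure (assumed reading
    of Eq. (4)); rho must lie in [0, 1].
    """
    if not 0.0 <= rho <= 1.0:
        raise ValueError("rho must lie in the interval [0, 1]")
    beta = 1.0 if success else 0.0
    return rho * a_ij + (1.0 - rho) * beta


def repeated_update(a_ij, rho, outcomes):
    """Apply the update once per outcome (True = success, False = failure)."""
    for success in outcomes:
        a_ij = aco_like_update(a_ij, rho, success)
    return a_ij
```

Under this reading, with ρ = 0.9 a success moves a coefficient of 0.5 to about 0.55 and a failure moves it to about 0.45. Repeated successes drive the coefficient towards 1 and repeated failures towards 0, at a speed set by ρ: values close to 1 change the coefficient slowly, while values close to 0 make it follow the most recent outcome almost entirely.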
 