Algorithm 2. EA(PModel, VP)
// PModel = Player Model, VP = Virtual Player
 1  for i ∈ [1, popsize − 1] do
 2      pop[i] ← Random-Solution();
 3  end for
 4  pop[popsize] ← VP;
 5  i ← 0;
 6  while i < MaxGenerations do
 7      Rank-Population(pop);                 // sort population according to fitness
 8      parent1 ← Select(pop);                // roulette wheel
 9      parent2 ← Select(pop);
10      if Rand[0, 1] < pX then               // recombination is done
11          (child1, child2) ← Recombine(parent1, parent2);
12      else
13          (child1, child2) ← (parent1, parent2);
14      end if
15      child1 ← Mutate(child1, pM);          // pM = mutation probability
16      child2 ← Mutate(child2, pM);
17      fitness1 ← PlayGameOff(PModel, child1);
18      fitness2 ← PlayGameOff(PModel, child2);
19      pop ← replace(pop, child1, child2);   // (popsize + 2) replacement
20      i ← i + 1;
21  end while
22  return best solution in pop;
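Algorithm 2 can be turned into a minimal runnable sketch. Everything below is illustrative rather than the paper's implementation: it assumes bit-string individuals, one-point crossover, and a stub PlayGameOff standing in for the off-line game simulation; the parameter values (POPSIZE, P_X, P_M, GENES) are arbitrary assumptions.

```python
import random

POPSIZE = 10          # population size (illustrative)
MAX_GENERATIONS = 50
P_X = 0.9             # recombination probability
P_M = 0.05            # per-gene mutation probability
GENES = 16            # length of an individual (illustrative encoding)

def random_solution():
    return [random.randint(0, 1) for _ in range(GENES)]

def play_game_off(pmodel, ind):
    """Stub for the off-line game simulation; here fitness = number of 1s."""
    return sum(ind)

def select(pop, fits):
    """Roulette-wheel selection, proportional to fitness."""
    total = sum(fits)
    r = random.uniform(0, total) if total > 0 else 0
    acc = 0.0
    for ind, f in zip(pop, fits):
        acc += f
        if acc >= r:
            return ind
    return pop[-1]

def recombine(p1, p2):
    """One-point crossover."""
    cut = random.randint(1, GENES - 1)
    return p1[:cut] + p2[cut:], p2[:cut] + p1[cut:]

def mutate(ind, pm):
    return [1 - g if random.random() < pm else g for g in ind]

def ea(pmodel, vp):
    # Lines 1-4: random population, seeded with the virtual player.
    pop = [random_solution() for _ in range(POPSIZE - 1)] + [vp]
    for _ in range(MAX_GENERATIONS):          # lines 6-21
        fits = [play_game_off(pmodel, x) for x in pop]
        parent1 = select(pop, fits)
        parent2 = select(pop, fits)
        if random.random() < P_X:
            child1, child2 = recombine(parent1, parent2)
        else:
            child1, child2 = parent1[:], parent2[:]
        child1, child2 = mutate(child1, P_M), mutate(child2, P_M)
        # (popsize + 2) replacement: keep the best POPSIZE individuals.
        pop = sorted(pop + [child1, child2],
                     key=lambda x: play_game_off(pmodel, x),
                     reverse=True)[:POPSIZE]
    return max(pop, key=lambda x: play_game_off(pmodel, x))
```

The (popsize + 2) replacement is modeled here as truncation over the old population plus the two children, which makes the best fitness in the population monotonically non-decreasing.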
(i.e., in real time) the actions that the human player executes, recording additionally
the specific conditions under which these are performed. At the end of this process
we generate an extended answer matrix (v+), that is, an answer matrix (i.e., an
individual encoding v) where each cell v+[i] (for 0 ≤ i < k) now represents a
vector of 6 positions (one per action), and v+[i][a] (for some action a ∈ [1, 6])
contains the probability that the human player executes action a under the environment
conditions (i.e., the state) associated with cell v[i]. Figure 2 (right) displays
an example that shows the probability of executing each of the 6 possible actions
in a specific situation of the unit (i.e., the soldier has medium energy, is in an
advantage state, is not suffering an attack, and knows where the opponent's flag is
placed). This extended answer matrix is finally used to design the virtual player
as follows: VP[i] = argmax_{a ∈ [1, 6]} { v+[i][a] }, for all possible situations i.
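The argmax rule above is straightforward to express in code. The sketch below is a hypothetical illustration: the names build_virtual_player and v_plus are assumptions, and the probability rows are made-up sample data, with actions indexed 1-6 as in the text.

```python
# v_plus[i] holds the observed probabilities of the 6 actions in situation i;
# VP[i] is the most probable action (1-based, matching a in [1, 6]).

def build_virtual_player(v_plus):
    # VP[i] = argmax over a in [1, 6] of v_plus[i][a]
    return [max(range(6), key=lambda a: probs[a]) + 1 for probs in v_plus]

v_plus = [
    [0.1, 0.5, 0.1, 0.1, 0.1, 0.1],    # situation 0: action 2 most likely
    [0.0, 0.0, 0.0, 0.9, 0.05, 0.05],  # situation 1: action 4 most likely
]
vp = build_virtual_player(v_plus)
# vp == [2, 4]
```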
3.4 Off-Line Phase: Evolutionary Optimization
Algorithm 2 shows the basic schema of our EA. The initial population is randomly
generated, except for one individual, which is initialized with the virtual player
(lines 1-4). Then a classical process of evolution (lines 6-21) is performed, and the
best solution in the population is finally returned (ties are broken randomly).
Evaluating the fitness of an individual x requires an off-line simulation of
the game between the player model and the virtual-player strategy encoded in
x. The fitness function depends on the statistical data collected at the end of
the simulation. The more statistical data collected, the higher the computational
cost, so a priori a good policy is to consider a limited amount of data. Four
statistics were used in our experiments during off-line evaluation: A: number of
deaths in the human player's army; B: number of deaths in the virtual player's
army; C: number of movements; and D: victory degree
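This excerpt lists the four statistics but not how they are combined into a single fitness value, so the function below is purely an illustrative assumption: made-up weights that reward human-army losses and victory degree while penalizing virtual-player losses and long games.

```python
# Hypothetical fitness combining the four collected statistics.
# A: human-army deaths (higher favors the virtual player),
# B: virtual-player-army deaths (lower is better),
# C: number of movements (fewer preferred),
# D: victory degree (assumed dominant term).
# All weights are assumptions, not taken from the paper.

def fitness(stats):
    a = stats["human_deaths"]
    b = stats["virtual_deaths"]
    c = stats["movements"]
    d = stats["victory_degree"]
    return 10.0 * d + a - b - 0.01 * c
```

Keeping the number of collected statistics small, as the text recommends, keeps the per-simulation bookkeeping (and hence the off-line evaluation cost) low.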