The basic objective of wall seeking (a generalization of wall following) is the generation of a trajectory parallel to a wall or, eventually, an obstacle. The fitness score used was:
$$\Phi_3 = \begin{cases} \dfrac{(1-\Theta_8)\,V + (1-\Theta_2)\,V}{3} & \text{if left} \\[1ex] \dfrac{(1-\Theta_7)\,V + (1-\Theta_5)\,V}{3} & \text{if right} \\[1ex] 0 & \text{otherwise} \end{cases} \tag{3}$$
where Θ is the minimum distance allowable between each sensor and a wall or obstacle. In all simulations a value of Θ = 0.3 was adopted.

Some fitness functions were normalized according to the generation number (Φ₁ and Φ₂ in [0, 1]), while the range of Φ₃ was (−∞, ∞).
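To make the scoring concrete, here is a minimal Python sketch of Eq. (3); the sensor indexing, the normalization of readings and speed, and the side-detection convention are assumptions rather than details given in the text.

```python
def wall_seeking_fitness(theta, v, side):
    """Score one control step of wall seeking (reconstruction of Eq. 3).

    theta -- per-sensor normalized proximity readings, indexed by sensor number
    v     -- current robot speed, assumed normalized to [0, 1]
    side  -- side on which a wall was detected ('left', 'right' or None),
             e.g. by thresholding the readings against the minimum
             allowable distance Θ = 0.3
    """
    if side == "left":
        # Reward speed while the wall-facing sensors stay close to the wall.
        return ((1 - theta[8]) * v + (1 - theta[2]) * v) / 3
    if side == "right":
        return ((1 - theta[7]) * v + (1 - theta[5]) * v) / 3
    return 0.0
```

Because the score is zero whenever no wall is detected, the evolved controllers are pushed to keep a wall within sensing range rather than wander in open space.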
The simple behaviours mentioned above were combined to implement the more complex behaviour of path generation, analyzing whether the robot approaches a certain source in a small closed environment while avoiding obstacles (Nolfi & Floreano, 2000; Togelius, 2004).

Coordination of Behavioural Levels

The coordination among behaviours was achieved, in a first approach, using another FFNN that takes the outputs of the behavioural modules and the sensors as inputs (Figure 5). The outputs of the coordination module directly control the actuators. The fitness score adopted was:
$$\Phi_{coord} = \Theta_A \left( 1 - \Theta_B \right) \tag{6}$$
where Θ_A is the maximum of all light sensors and Θ_B is the maximum of all proximity sensors.
Learning: This behaviour consists of the robot approaching one of two possible light sources (targets) (Nolfi & Floreano, 2000; Togelius, 2004). In half of the tests (the learning stage), the objective to be reached varies without any predefined pattern. The robot does not know a priori which light source it should reach at the beginning of each test; it is therefore expected to learn to discriminate its objective in a trial-and-error paradigm. Reinforcement learning (ϕ) is accomplished using the following score:
$$\varphi = \begin{cases} \delta & \text{if the goal reached is right} \\ -\delta & \text{if the goal reached is wrong} \\ 0 & \text{otherwise} \end{cases} \tag{4}$$
where δ is a default value, set to 2 for the proposed experiment. The aim is to maximize the number of times the robot reaches the right objective in an obstacle-free environment. The fitness function is based on Togelius' proposal (Togelius, 2003):
$$\Phi_4 = \begin{cases} \displaystyle\sum_{i=0}^{n=200} \varphi_i & \text{if } \displaystyle\sum_{i=0}^{n=200} \varphi_i \geq 1 \\[2ex] \max_j \left( S_j \right) & \text{otherwise} \end{cases} \tag{5}$$

where S_j refers to the sensed value for the j-th sensor (1 ≤ j ≤ 7).
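The following short Python sketch ties Eqs. (4) and (5) together; the outcome encoding and the fallback branch follow the reconstruction above and are assumptions rather than the authors' code.

```python
DELTA = 2  # default reinforcement value from the text

def reinforcement(outcome):
    """Eq. (4): per-trial reinforcement, +δ for the right goal, -δ for
    the wrong one, 0 when no goal was reached."""
    if outcome == "right":
        return DELTA
    if outcome == "wrong":
        return -DELTA
    return 0

def learning_fitness(outcomes, sensor_values):
    """Eq. (5) as reconstructed above: sum the reinforcements over the
    n = 200 trials; when no positive score accumulates, fall back on the
    largest sensor reading S_j as a gradient toward the targets
    (assumed interpretation of the 'otherwise' branch)."""
    total = sum(reinforcement(o) for o in outcomes)
    return total if total >= 1 else max(sensor_values)
```

The fallback branch matters early in evolution: controllers that never reach a goal would otherwise all score zero, leaving selection nothing to work with.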
Experimental Results

In a first step, several parameters (shown in Table 1) were determined for the experiments. After the learning stage, the neurocontrollers that best supported the corresponding behaviour according to the fitness function were selected.

As may be deduced from Table 1, the number of neurons in each layer was fixed a priori. That is, instead of letting the network size be adjusted as part of the evolutionary process, a small ANN of fixed size was adopted. This increased the speed of the learning phase and resulted in acceptable real-time performance, both important issues in practical prototyping. However, from a philosophical standpoint, this is in fact a human intervention that may constrain a purely evolutionary development.
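To illustrate why a fixed topology simplifies the evolutionary machinery, the sketch below encodes a fixed-size network as a flat, constant-length genome vector; the layer sizes are placeholders, not the values of Table 1.

```python
import numpy as np

# Placeholder layer sizes -- NOT the values of Table 1.
N_IN, N_HID, N_OUT = 7, 5, 2
GENOME_LEN = N_IN * N_HID + N_HID + N_HID * N_OUT + N_OUT

def decode(genome):
    """Unpack a flat, fixed-length weight vector into the network's layers.
    Because the topology never changes, mutation and crossover can treat
    every individual as a plain vector of GENOME_LEN floats, with no need
    for the variable-size genome handling that evolving topologies require."""
    i = 0
    w1 = genome[i:i + N_IN * N_HID].reshape(N_HID, N_IN)
    i += N_IN * N_HID
    b1 = genome[i:i + N_HID]
    i += N_HID
    w2 = genome[i:i + N_HID * N_OUT].reshape(N_OUT, N_HID)
    i += N_HID * N_OUT
    b2 = genome[i:i + N_OUT]
    return w1, b1, w2, b2
```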