The basic objective of wall seeking (a generalization of wall following) is the generation of a trajectory parallel to a wall or, eventually, an obstacle. The fitness score used was:
$$\Phi_3 = \begin{cases} \dfrac{(1-\Theta_8)\,V + (1-\Theta_2)\,V}{3} & \text{if left} \\[1ex] \dfrac{(1-\Theta_7)\,V + (1-\Theta_5)\,V}{3} & \text{if right} \\[1ex] 0 & \text{otherwise} \end{cases} \tag{3}$$
where Θ is the minimum distance allowable between each sensor and a wall or obstacle. In all simulations a value of Θ = 0.3 was adopted.

Some fitness functions were normalized according to the generation number (Φ₁ and Φ₂ in [0, 1]), while the range of Φ₃ was (−∞, ∞).
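To make the scoring concrete, here is a minimal Python sketch of Eq. (3); the sensor indexing, the normalization of readings and speed, and the side-detection convention are assumptions rather than details given in the text.

```python
def wall_seeking_fitness(theta, v, side):
    """Score one control step of wall seeking (reconstruction of Eq. 3).

    theta -- per-sensor normalized proximity readings, indexed by sensor number
    v     -- current robot speed, assumed normalized to [0, 1]
    side  -- side on which a wall was detected ('left', 'right' or None),
             e.g. by thresholding the readings against the minimum
             allowable distance Θ = 0.3
    """
    if side == "left":
        # Reward speed while the wall-facing sensors stay close to the wall.
        return ((1 - theta[8]) * v + (1 - theta[2]) * v) / 3
    if side == "right":
        return ((1 - theta[7]) * v + (1 - theta[5]) * v) / 3
    return 0.0
```

Because the score is zero whenever no wall is detected, the evolved controllers are pushed to keep a wall within sensing range rather than wander in open space.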
The simple behaviours mentioned above were combined to implement the more complex behaviour of path generation, analyzing whether the robot approaches a certain source in a small closed environment while avoiding obstacles (Nolfi & Floreano, 2000; Togelius, 2004).

Coordination of Behavioural Levels

The coordination among behaviours was achieved, in a first approach, using another FFNN that takes the outputs of the behavioural modules and the sensors as inputs (Figure 5). The outputs of the coordination module directly control the actuators. The fitness score adopted was:
$$\Phi_{coord} = \Theta_A \left( 1 - \Theta_B \right) \tag{6}$$
where Θ_A is the maximum of all light sensors and Θ_B is the maximum of all proximity sensors.
Learning: This behaviour consists of the robot approaching one of two possible light sources (targets) (Nolfi & Floreano, 2000; Togelius, 2004). In half of the tests (the learning stage), the objective to be reached varies without any predefined pattern. The robot does not know a priori which light source it should reach at the beginning of each test; it is therefore expected to learn to discriminate its objective in a trial-and-error paradigm. Reinforcement learning (ϕ) is accomplished using the following score:
$$\varphi = \begin{cases} \delta & \text{if the goal reached is right} \\ -\delta & \text{if the goal reached is wrong} \\ 0 & \text{otherwise} \end{cases} \tag{4}$$
where δ is a default value, set to 2 for the proposed experiment. The aim is to maximize the number of times the robot reaches the right objective in an obstacle-free environment. The fitness function is based on Togelius' proposal (Togelius, 2003):
$$\Phi_4 = \begin{cases} \displaystyle\sum_{i=0}^{n=200} \varphi_i & \text{if } \displaystyle\sum_{i=0}^{n=200} \varphi_i \geq 1 \\[2ex] \max_j \left( S_j \right) & \text{otherwise} \end{cases} \tag{5}$$

where S_j refers to the sensed value for the j-th sensor (1 ≤ j ≤ 7).
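The following short Python sketch ties Eqs. (4) and (5) together; the outcome encoding and the fallback branch follow the reconstruction above and are assumptions rather than the authors' code.

```python
DELTA = 2  # default reinforcement value from the text

def reinforcement(outcome):
    """Eq. (4): per-trial reinforcement, +δ for the right goal, -δ for
    the wrong one, 0 when no goal was reached."""
    if outcome == "right":
        return DELTA
    if outcome == "wrong":
        return -DELTA
    return 0

def learning_fitness(outcomes, sensor_values):
    """Eq. (5) as reconstructed above: sum the reinforcements over the
    n = 200 trials; when no positive score accumulates, fall back on the
    largest sensor reading S_j as a gradient toward the targets
    (assumed interpretation of the 'otherwise' branch)."""
    total = sum(reinforcement(o) for o in outcomes)
    return total if total >= 1 else max(sensor_values)
```

The fallback branch matters early in evolution: controllers that never reach a goal would otherwise all score zero, leaving selection nothing to work with.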
Experimental Results

In a first step, several parameters (shown in Table 1) were determined for the experiments. After the learning stage, the neurocontrollers that best supported the corresponding behaviour according to the fitness function were selected.

As may be deduced from Table 1, the number of neurons in each layer was fixed a priori. That is, instead of letting the network size be adjusted as part of the evolutionary process, a small ANN of fixed size was adopted. This increased the speed of the learning phase and resulted in acceptable real-time performance, both important issues in practical prototyping. However, from a philosophical standpoint, this is in fact a human intervention that may constrain a purely evolutionary development.
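To illustrate why a fixed topology simplifies the evolutionary machinery, the sketch below encodes a fixed-size network as a flat, constant-length genome vector; the layer sizes are placeholders, not the values of Table 1.

```python
import numpy as np

# Placeholder layer sizes -- NOT the values of Table 1.
N_IN, N_HID, N_OUT = 7, 5, 2
GENOME_LEN = N_IN * N_HID + N_HID + N_HID * N_OUT + N_OUT

def decode(genome):
    """Unpack a flat, fixed-length weight vector into the network's layers.
    Because the topology never changes, mutation and crossover can treat
    every individual as a plain vector of GENOME_LEN floats, with no need
    for the variable-size genome handling that evolving topologies require."""
    i = 0
    w1 = genome[i:i + N_IN * N_HID].reshape(N_HID, N_IN)
    i += N_IN * N_HID
    b1 = genome[i:i + N_HID]
    i += N_HID
    w2 = genome[i:i + N_HID * N_OUT].reshape(N_OUT, N_HID)
    i += N_HID * N_OUT
    b2 = genome[i:i + N_OUT]
    return w1, b1, w2, b2
```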