[Figure: five panels, one per learning algorithm parameter, each comparing convergent and non-convergent runs against the parameter value. Panels in order, with KS statistic and axis range: α mag (KS = 0.0734, 0.001 to 0.010); α ratio (KS = 0.048, 2.00 to 5.00); λ (KS = 0.252, 0.400 to 0.700); γ (KS = 0.211, 0.960 to 0.990); ε (KS = 0.183, 0.85 to 0.97).]
Fig. 7.5 Regional sensitivity analysis based on convergence of all experimental runs for the TTBU
problem.
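The KS values reported in the panels of Fig. 7.5 can be read as distances between the parameter values of convergent and non-convergent runs. The following is a minimal sketch of that computation, assuming the values are the standard two-sample Kolmogorov-Smirnov statistic. The function names, the `runs` data layout, and the sample values are hypothetical and are not taken from the experiments.

```python
from bisect import bisect_right


def ecdf(sorted_sample, x):
    """Fraction of sample values less than or equal to x."""
    return bisect_right(sorted_sample, x) / len(sorted_sample)


def ks_statistic(sample_a, sample_b):
    """Two-sample KS statistic: the largest gap between the two empirical CDFs."""
    a, b = sorted(sample_a), sorted(sample_b)
    if not a or not b:
        raise ValueError("both samples must be non-empty")
    # Both ECDFs are step functions, so the largest gap occurs at an observed value.
    return max(abs(ecdf(a, x) - ecdf(b, x)) for x in a + b)


def regional_sensitivity(runs, parameter):
    """KS statistic between convergent and non-convergent values of one parameter."""
    convergent = [r[parameter] for r in runs if r["converged"]]
    non_convergent = [r[parameter] for r in runs if not r["converged"]]
    return ks_statistic(convergent, non_convergent)


# Made-up runs for illustration only; these are not the experimental data.
runs = [
    {"lam": 0.40, "converged": False},
    {"lam": 0.45, "converged": False},
    {"lam": 0.50, "converged": True},
    {"lam": 0.55, "converged": False},
    {"lam": 0.60, "converged": True},
    {"lam": 0.65, "converged": True},
    {"lam": 0.70, "converged": True},
]
print(regional_sensitivity(runs, "lam"))
```

Under this reading, a larger statistic means that convergence depends more strongly on where the parameter sits within its range, while a value near zero means that the two groups of runs are distributed almost identically over that parameter.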
which is essential to solving this problem. Furthermore, we find ranges of the learning
algorithm parameters that perform well in controlling the truck to the goal location,
which is a notable achievement. While we would like a robust controller that can be
used more generally, all that an implementation strictly requires is one convergent
learning run, such as that shown in Fig. 7.4.
Due to the challenging nature of this problem, and partially due to the basic
reinforcement learning strategy used, we believe that a sequential learning approach
is essential to obtaining a refined controller. Thus, the work we present is perhaps
the first stage of this sequential learning process. Subsequent training could be
used to improve the current controller in a number of ways. The tolerances on the
goal location (and orientation) could be reduced in order to back up the truck to a
more specific location. The ability of the controller could be improved so that it can
generalize to successfully back up the truck from anywhere in the domain. These
approaches could be used in stages or used in an adaptable learning procedure that
refines the goal tolerances or increases the state space coverage based on the current
learning performance. Regardless of the sequential training approach, each stage
would use the network weights from the previous training stage as its starting
weights. We note that it
is always possible that a new network with random weights could also be trained