Our problem resembles that of structural reliability, where the goal is to find the
probability of failure of a system or structure, which also involves finding a boundary
(possibly a contour line rather than an axis-aligned boundary) between two regions.
In such a problem, there is typically a measurable quantitative performance value
that characterizes the quality of the system under study; even if this value is poor or
somewhat erroneous, there is always a value. In other words, the code is
non-degenerate. However, our problem differs in that when reinforcement learning
does not converge, there is no measurable value of performance.
Furthermore, the regions we seek are defined by convergence
versus non-convergence, and it is likely that there is not a single value that separates
these regions. Thus, methods that are often used for structural reliability cannot be
directly applied, though similar concepts could be used to develop a procedure that
could help identify convergent subregions.
Another potential use of the sequential CART procedure is as a method for screening
variables, potentially reducing the number of variables to be explored in subsequent
experimentation. In some of the problems considered here, the convergent subregions
extended over nearly the entire original range of an individual parameter, and did so
for most of the convergent subregions found (cf. α_mag for the mountain car problem
and for the TTBU problem). In
these cases, a specific parameter range (as a subset of the range explored) may not
be required for convergence. Setting these parameters at their average values would
make any subsequent experimentation easier due to the smaller number of variables.
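
The screening idea above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the procedure used in the text: the function name `screen_parameters`, the parameter names `param_a` and `param_b`, the 0.9 coverage threshold, the 0.75 "most subregions" threshold, and the use of the range midpoint as the fixed value are all hypothetical choices made for the example.

```python
def screen_parameters(full_ranges, subregions, coverage=0.9, region_fraction=0.75):
    """Suggest parameters to fix at the midpoint of their explored range.

    full_ranges: {name: (low, high)} for the original parameter space.
    subregions:  list of {name: (low, high)} boxes labeled convergent.
    A parameter is flagged when, in at least `region_fraction` of the
    subregions, the box spans at least `coverage` of the full range.
    Both thresholds are illustrative assumptions.
    """
    fixed = {}
    if not subregions:
        return fixed
    for name, (low, high) in full_ranges.items():
        width = high - low
        wide = 0
        for box in subregions:
            # A parameter missing from a box is treated as unconstrained.
            b_low, b_high = box.get(name, (low, high))
            spanned = (min(b_high, high) - max(b_low, low)) / width
            if spanned >= coverage:
                wide += 1
        if wide / len(subregions) >= region_fraction:
            # Midpoint of the explored range (an assumed choice of fixed value).
            fixed[name] = (low + high) / 2.0
    return fixed


# Hypothetical data: two parameters explored on [0, 1], three convergent boxes.
full_ranges = {"param_a": (0.0, 1.0), "param_b": (0.0, 1.0)}
subregions = [
    {"param_a": (0.02, 0.98), "param_b": (0.10, 0.40)},
    {"param_a": (0.00, 0.95), "param_b": (0.30, 0.60)},
    {"param_a": (0.05, 1.00), "param_b": (0.20, 0.35)},
]
suggested = screen_parameters(full_ranges, subregions)
print(suggested)
```

Here `param_a` is flagged because every convergent box spans at least 90% of its range, while `param_b` is kept as a variable because its convergent ranges are narrow.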
Ideally, the subregions labeled as convergent by the sequential CART procedure
would contain only convergent runs. However, because the experimental design points
were randomly sampled and the computational budget was limited, achieving pure
subregions was not possible. Increasing the purity of these subregions would require
more design points and more replicates for each design point. An obvious extension to
the current sequential CART algorithm is to parallelize the experimentation within
each iteration of the algorithm, which would allow more design runs and thereby
improve the accuracy of our results.
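
As a rough sketch of the two ideas in this paragraph, the example below runs replicated design points concurrently and then measures the purity of a subregion as the fraction of its runs that converged. Everything named here is an assumption for illustration: `run_once` is a toy stand-in rather than a real learner, and `evaluate_in_parallel`, `purity`, and `param_a` are hypothetical names.

```python
from concurrent.futures import ThreadPoolExecutor


def run_once(task):
    """Toy stand-in for one learning run (NOT a real learner).

    Returns True if the run "converged". The rule is arbitrary: runs with
    param_a < 0.5 converge, except every third replicate when param_a > 0.3.
    """
    point, replicate = task
    near_edge = point["param_a"] > 0.3
    return point["param_a"] < 0.5 and not (near_edge and replicate % 3 == 0)


def evaluate_in_parallel(design_points, n_replicates, runner, max_workers=4):
    """Run every (design point, replicate) pair concurrently."""
    tasks = [(p, r) for p in design_points for r in range(n_replicates)]
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() returns results in the same order as the submitted tasks.
        outcomes = list(pool.map(runner, tasks))
    return [(point, ok) for (point, _), ok in zip(tasks, outcomes)]


def purity(runs, bounds):
    """Fraction of runs inside the box `bounds` that converged, or None if empty."""
    inside = [
        ok
        for point, ok in runs
        if all(lo <= point[name] <= hi for name, (lo, hi) in bounds.items())
    ]
    return sum(inside) / len(inside) if inside else None


design_points = [{"param_a": 0.2}, {"param_a": 0.4}, {"param_a": 0.9}]
runs = evaluate_in_parallel(design_points, n_replicates=3, runner=run_once)
region_purity = purity(runs, {"param_a": (0.0, 0.5)})
print(region_purity)
```

Threads keep the sketch simple; for CPU-bound learning runs, `concurrent.futures.ProcessPoolExecutor` offers the same `map` interface and would likely be the more suitable choice.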
8.2.2 Stochastic Kriging
Stochastic kriging is a relatively recent extension of deterministic kriging. A
number of studies explore the effects of experimentation and modeling on either
stochastic or deterministic kriging, including the experimental design (Chen and Lin
2013), the use of common random numbers in the experimental design (Chen et al.
2012), bootstrap model parameter estimation (Kleijnen 2013), and the use of gradient
estimators to improve the metamodeling (Chen et al. 2013). However, these studies
focus on low-dimensional (i.e., 1-D or 2-D) benchmark problems, and it is unknown
how these methods extend to high-dimensional problems.
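
For orientation, the relationship between the two methods can be written out. The notation below is the commonly used stochastic kriging formulation and is an assumption of this sketch, not notation taken from the text: $M(x)$ is the spatially correlated (extrinsic) term, $\varepsilon_j(x)$ is the simulation (intrinsic) noise on replicate $j$, $\bar{Y}$ is the vector of sample means at the $k$ design points, $\Sigma_M$ is the covariance matrix of $M$ across the design points, and $\Sigma_\varepsilon$ is the covariance matrix of the averaged noise.

```latex
\begin{align}
  Y_j(x) &= \beta_0 + M(x) + \varepsilon_j(x), \\
  \hat{Y}(x_0) &= \hat{\beta}_0
    + \Sigma_M(x_0, \cdot)^{\top}
      \left[ \Sigma_M + \Sigma_\varepsilon \right]^{-1}
      \left( \bar{Y} - \hat{\beta}_0 \mathbf{1}_k \right).
\end{align}
```

Setting $\Sigma_\varepsilon = 0$ recovers the deterministic kriging predictor. When replicates are independent across design points, $\Sigma_\varepsilon$ is diagonal, with entries equal to the noise variance at each design point divided by its number of replicates.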