experimental runs are analyzed, and these points can be indicative of factor sensitivity (Kleijnen 2009). More formal spatial statistics methods can also be useful in understanding parameter effects by using variograms and their relatives (Journel and Huijbregts 1978; Cressie 1993).
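As a concrete illustration of this last idea, the Python sketch below computes an empirical semivariogram of an algorithm's performance response over its parameter space; the parameter settings, the response, and the lag binning are hypothetical placeholders, not data from the cited studies.

import numpy as np

def empirical_variogram(X, z, n_bins=10):
    """Empirical semivariogram of a response z over parameter settings X:
    gamma(h) = (1 / (2 * |N(h)|)) * sum over pairs in N(h) of (z_i - z_j)**2,
    where N(h) collects pairs of settings whose distance falls in lag bin h."""
    diff = X[:, None, :] - X[None, :, :]
    d = np.sqrt((diff ** 2).sum(axis=-1))       # pairwise distances between settings
    sq = (z[:, None] - z[None, :]) ** 2         # squared response differences
    iu = np.triu_indices(len(z), k=1)           # keep each pair only once
    d, sq = d[iu], sq[iu]
    edges = np.linspace(0.0, d.max(), n_bins + 1)
    lags = 0.5 * (edges[:-1] + edges[1:])
    gamma = np.array([0.5 * sq[(d >= lo) & (d < hi)].mean()
                      if ((d >= lo) & (d < hi)).any() else np.nan
                      for lo, hi in zip(edges[:-1], edges[1:])])
    return lags, gamma

# Hypothetical data: 200 random settings of two parameters and a noisy response.
rng = np.random.default_rng(0)
X = rng.uniform(size=(200, 2))
z = np.sin(3 * X[:, 0]) + 0.5 * X[:, 1] + 0.1 * rng.standard_normal(200)
lags, gamma = empirical_variogram(X, z)
print(np.round(gamma, 3))   # how response variability grows with parameter distance

A flat semivariogram would suggest that performance is insensitive to where in the parameter space the algorithm is run, while a steadily rising one indicates strong parameter effects at the corresponding scale.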
3.3 Design of Experiments for Empirical Algorithm Analysis
Empirical analysis of machine learning algorithms is, in general, a standard approach for benchmarking and testing algorithms. The case is no different for reinforcement learning, where there are countless examples in which authors present figures showing the performance of the agent over the course of learning. In these cases, when multiple learning algorithms are evaluated or when the parameters of a single learning algorithm are varied, multiple learning performance curves are presented in order to give the reader a sense of the relative speed of learning and the maximal performance for each algorithm or parameter variation.
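The Python sketch below illustrates this common practice; the two "algorithms", their simulated learning curves, and the number of runs are invented placeholders used only to show how mean curves with standard error bands might be plotted.

import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(1)
episodes = np.arange(1, 201)

# Hypothetical learning curves: 30 independent runs per algorithm, return per episode.
def simulate_runs(rate, ceiling=100.0, n_runs=30):
    base = ceiling * (1 - np.exp(-rate * episodes))
    return base + 5.0 * rng.standard_normal((n_runs, episodes.size))

for label, rate in [("algorithm A", 0.03), ("algorithm B", 0.015)]:
    runs = simulate_runs(rate)
    mean = runs.mean(axis=0)
    se = runs.std(axis=0, ddof=1) / np.sqrt(runs.shape[0])
    plt.plot(episodes, mean, label=label)                                # mean learning curve
    plt.fill_between(episodes, mean - 2 * se, mean + 2 * se, alpha=0.2)  # ~95% band

plt.xlabel("episode")
plt.ylabel("mean return")
plt.legend()
plt.show()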
The literature is relatively scarce with respect to the use of a design of experiments approach to empirically evaluating learning algorithms (not necessarily reinforcement learning algorithms) or heuristics, though this scarcity and lack of rigor have been acknowledged (Hooker 1995; Eiben and Jelasity 2002). Parsons and Johnson (1997) use response surface methods with central composite and fractional factorial designs to improve the performance of genetic algorithms for DNA sequence assembly. Park and Kim (1998) use a non-linear response surface method to select parameters for simulated annealing and show the effectiveness of this approach on graph partitioning, flowshop scheduling, and production scheduling problems. Coy et al. (2000) use a design of experiments approach with response surface methods to find optimal parameters for heuristics that are commonly used in vehicle routing problems. Shilane et al. (2008) develop an approach to statistically compare evolutionary algorithms.
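To give a rough sense of the response surface idea used in several of these studies, the Python sketch below fits a second-order model to a small central composite design over two coded algorithm parameters and locates the stationary point of the fitted surface; the design, the placeholder response, and the suggested parameter names are assumptions for illustration, not a reproduction of any cited experiment.

import numpy as np

# Hypothetical central composite design in two coded parameters (say, a learning
# rate and an exploration rate): 4 factorial points, 4 axial points, 3 centre runs.
alpha = np.sqrt(2.0)
design = np.array([[-1, -1], [1, -1], [-1, 1], [1, 1],
                   [-alpha, 0], [alpha, 0], [0, -alpha], [0, alpha],
                   [0, 0], [0, 0], [0, 0]], dtype=float)
x1, x2 = design[:, 0], design[:, 1]

# Placeholder response: in practice each row would be one run of the algorithm.
rng = np.random.default_rng(3)
y = (50 + 4 * x1 + 2 * x2 - 3 * x1 ** 2 - 2 * x2 ** 2 + x1 * x2
     + rng.standard_normal(len(design)))

# Fit the full second-order response surface by least squares.
X = np.column_stack([np.ones_like(x1), x1, x2, x1 ** 2, x2 ** 2, x1 * x2])
b, *_ = np.linalg.lstsq(X, y, rcond=None)

# Stationary point of the fitted quadratic: x* = -0.5 * B^{-1} b_linear.
B = np.array([[b[3], b[5] / 2.0], [b[5] / 2.0, b[4]]])
x_star = -0.5 * np.linalg.solve(B, b[1:3])
print("estimated optimum (coded units):", np.round(x_star, 3))

The stationary point of the fitted quadratic is then a candidate parameter setting, to be confirmed with follow-up runs.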
Perhaps the largest series of work in this area belongs to Ridge and Kudenko (2006, 2007a, b, c, 2008), with their work investigating the ant colony optimization (ACO) algorithm. Ridge and Kudenko (2006) thoroughly outline a potential design of experiments approach for studying the ACO algorithm, including the use of screening experiments, de-aliasing of effects, and response surface methods. Their work details a sequential experimentation procedure based on a screening experiment and response surface methods to gain an initial understanding of algorithm parameters. Similar work was applied to studying the ACO algorithm by using a fractional factorial design to understand the effects of 12 parameters (Ridge and Kudenko 2007b), and by using response surface methods to optimize algorithm parameters (Ridge and Kudenko 2007a, c). Ridge and Kudenko (2008) again investigated the effects of ACO parameters, this time using a hierarchical (nested) design that was analyzed with a general linear model with fixed and random effects.
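To give a flavor of this last kind of analysis, the Python sketch below fits a linear mixed-effects model (via statsmodels) to a hypothetical nested data set in which problem instances enter as a random effect and a single coded algorithm parameter enters as a fixed effect; the data, the parameter, and the effect sizes are invented for illustration and do not reproduce Ridge and Kudenko's actual design.

import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical nested data: 10 problem instances (random effect), each solved at
# two coded levels of a single algorithm parameter (fixed effect), 5 replicates.
rng = np.random.default_rng(4)
rows = []
for i in range(10):
    instance_offset = rng.normal(scale=2.0)      # instance-to-instance variation
    for level in (0.0, 1.0):
        for _ in range(5):
            rows.append({"instance": f"inst{i}",
                         "level": level,
                         "score": 20 + 3.0 * level + instance_offset + rng.normal()})
df = pd.DataFrame(rows)

# Mixed model: fixed effect of the parameter level, random intercept per instance.
result = smf.mixedlm("score ~ level", data=df, groups=df["instance"]).fit()
print(result.summary())

Treating the problem instance as a random effect lets the fixed-effect estimate for the parameter generalize beyond the particular instances used in the experiment.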