In Chaps. 5-7, we explore the mountain car problem, the single-trailer truck backer-upper problem, and the tandem-trailer truck backer-upper problem, respectively. We conclude this work in Chap. 8 with a discussion of our findings, our innovations, and possible directions for future work. Appendices A and B present the foundational work from which this work grew: studies exploring the effects of parameters in reinforcement learning applied to a two-player board game and to a benchmark control problem.