In Chaps. 5-7, we explore the mountain car problem, the single-trailer truck backer-upper problem, and the tandem-trailer truck backer-upper problem, respectively. We conclude this work in Chap. 8 with a discussion of our findings, our innovations, and possible directions for future work. Appendices A and B present the foundational work from which this work grew: studies exploring the effects of parameters in reinforcement learning applied to a two-player board game and to a benchmark control problem.