Skoulakis, I. & Lagoudakis, M. (2012). Efficient reinforcement learning in adversarial games. In Proceedings of the 24th IEEE International Conference on Tools with Artificial Intelligence (ICTAI), Athens, Greece, 7-9 November (pp. 704-711). doi: 10.1109/ICTAI.2012.100
Smart, W. D. (2002). Making reinforcement learning work on real robots. Unpublished PhD dissertation, Brown University, Providence, RI.
Smart, W. D. & Kaelbling, L. P. (2002). Effective reinforcement learning for mobile robots. In Proceedings of the IEEE International Conference on Robotics and Automation (ICRA), Washington, D.C., 11-15 May (Vol. 4, pp. 3404-3410). doi: 10.1109/ROBOT.2002.1014237
Smith, A. J. (2002). Applications of the self-organising map to reinforcement learning. Neural Networks, 15(8-9), 1107-1124.
Stanley, K. O. & Miikkulainen, R. (2002). Evolving neural networks through augmenting topologies. Evolutionary Computation, 10(2), 99-127.
Sutton, R. S. (1984). Temporal credit assignment in reinforcement learning. Unpublished PhD dissertation, University of Massachusetts, Amherst, MA.
Sutton, R. S. (1996). Generalization in reinforcement learning: Successful examples using sparse coarse coding. In Advances in Neural Information Processing Systems 8 (pp. 1038-1044). Cambridge, MA: MIT Press.
Sutton, R. S. & Barto, A. G. (1998). Reinforcement Learning: An Introduction. Cambridge, MA: MIT Press.
Sutton, R. S., McAllester, D., Singh, S., & Mansour, Y. (2000). Policy gradient methods for reinforcement learning with function approximation. In Advances in Neural Information Processing Systems 12 (pp. 1057-1063). Cambridge, MA: MIT Press.
Sutton, R. S., Maei, H. R., Precup, D., Bhatnagar, S., Silver, D., Szepesvári, C., & Wiewiora, E. (2009a). Fast gradient-descent methods for temporal-difference learning with linear function approximation. In Proceedings of the 26th International Conference on Machine Learning, Montreal, Quebec, 14-18 June (pp. 993-1000). doi: 10.1145/1553374.1553501
Sutton, R. S., Szepesvári, C., & Maei, H. R. (2009b). A convergent O(n) algorithm for off-policy temporal-difference learning with linear function approximation. In Advances in Neural Information Processing Systems 21 (pp. 1609-1616). Cambridge, MA: MIT Press.
Szepesvári, C. (2010). Algorithms for Reinforcement Learning. San Rafael, CA: Morgan & Claypool.
Tan, A.-H., Lu, N., & Xiao, D. (2008). Integrating temporal difference methods and self-organizing neural networks for reinforcement learning with delayed evaluative feedback. IEEE Transactions on Neural Networks, 19(2), 230-244.
Taylor, M. E. & Stone, P. (2009). Transfer learning for reinforcement learning domains: A survey. Journal of Machine Learning Research, 10(1), 1633-1685.
Tesauro, G. (1992). Practical issues in temporal difference learning. Machine Learning, 8(3-4), 257-277.
Tesauro, G. (1995). Temporal difference learning and TD-Gammon. Communications of the ACM, 38(3), 58-68.
Tesauro, G., Jong, N. K., Das, R., & Bennani, M. N. (2007). On the use of hybrid reinforcement learning for autonomic resource allocation. Cluster Computing, 10(3), 287-299.
Thrun, S. (1995). Learning to play the game of chess. In Advances in Neural Information Processing Systems 7 (pp. 1069-1076). Cambridge, MA: MIT Press.
Thrun, S. & Schwartz, A. (1993). Issues in using function approximation for reinforcement learning. In Mozer, M., Smolensky, P., Touretzky, D., Elman, J., & Weigend, A. (Eds.), Proceedings of the 4th Connectionist Models Summer School, Pittsburgh, PA, 2-5 August (pp. 255-263). Hillsdale, NJ: Lawrence Erlbaum.
Torrey, L. (2009). Relational transfer in reinforcement learning. Unpublished PhD dissertation, University of Wisconsin, Madison, WI.
Touzet, C. F. (1997). Neural reinforcement learning for behaviour synthesis. Robotics and Autonomous Systems, 22(3-4), 251-281.
Tsitsiklis, J. N. & Van Roy, B. (1996). Feature-based methods for large scale dynamic programming. Machine Learning, 22(1-3), 59-94.