Skoulakis, I. & Lagoudakis, M. (2012). Efficient reinforcement learning in adversarial games. In
Proceedings of the 24th IEEE International Conference on Tools with Artificial Intelligence
(ICTAI), Athens, Greece, 7-9 November (pp. 704-711). doi: 10.1109/ICTAI.2012.100
Smart, W. D. (2002). Making reinforcement learning work on real robots. Unpublished PhD
dissertation, Brown University, Providence, RI.
Smart, W. D. & Kaelbling, L. P. (2002). Effective reinforcement learning for mobile robots. In Pro-
ceedings of the IEEE International Conference on Robotics and Automation (ICRA), Washington,
D.C., 11-15 May (Vol. 4, pp. 3404-3410). doi: 10.1109/ROBOT.2002.1014237
Smith, A. J. (2002). Applications of the self-organising map to reinforcement learning. Neural
Networks, 15(8-9), 1107-1124.
Stanley, K. O. & Miikkulainen, R. (2002). Evolving neural networks through augmenting
topologies. Evolutionary Computation, 10(2), 99-127.
Sutton, R. S. (1984). Temporal credit assignment in reinforcement learning. Unpublished PhD
dissertation, University of Massachusetts, Amherst, MA.
Sutton, R. S. (1996). Generalization in reinforcement learning: Successful examples using sparse
coarse coding. In Advances in Neural Information Processing Systems 8 (pp. 1038-1044).
Cambridge, MA: MIT Press.
Sutton, R. S. & Barto, A. G. (1998). Reinforcement Learning: An Introduction. Cambridge, MA: MIT Press.
Sutton, R. S., McAllester, D., Singh, S., & Mansour, Y. (2000). Policy gradient methods for rein-
forcement learning with function approximation. In Advances in Neural Information Processing
Systems 12 (pp. 1057-1063). Cambridge, MA: MIT Press.
Sutton, R. S., Maei, H. R., Precup, D., Bhatnagar, S., Silver, D., Szepesvári, C., & Wiewiora, E.
(2009a). Fast gradient-descent methods for temporal-difference learning with linear function
approximation. In Proceedings of the 26th International Conference on Machine Learning,
Montreal, Quebec, 14-18 June (pp. 993-1000). doi: 10.1145/1553374.1553501
Sutton, R. S., Szepesvári, C., & Maei, H. R. (2009b). A convergent O(n) algorithm for off-
policy temporal-difference learning with linear function approximation. In Advances in Neural
Information Processing Systems 21 (pp. 1609-1616). Cambridge, MA: MIT Press.
Szepesvári, C. (2010). Algorithms for Reinforcement Learning. San Rafael, CA: Morgan &
Claypool.
Tan, A.-H., Lu, N., & Xiao, D. (2008). Integrating temporal difference methods and self-organizing
neural networks for reinforcement learning with delayed evaluative feedback. IEEE Transactions
on Neural Networks, 19(2), 230-244.
Taylor, M. E. & Stone, P. (2009). Transfer learning for reinforcement learning domains: A survey.
Journal of Machine Learning Research, 10(1), 1633-1685.
Tesauro, G. (1992). Practical issues in temporal difference learning. Machine Learning, 8(3-4),
257-277.
Tesauro, G. (1995). Temporal difference learning and TD-Gammon. Communications of the ACM,
38(3), 58-68.
Tesauro, G., Jong, N. K., Das, R., & Bennani, M. N. (2007). On the use of hybrid reinforcement
learning for autonomic resource allocation. Cluster Computing, 10(3), 287-299.
Thrun, S. (1995). Learning to play the game of Chess. In Advances in Neural Information Processing
Systems 7 (pp. 1069-1076). Cambridge, MA: MIT Press.
Thrun, S. & Schwartz, A. (1993). Issues in using function approximation for reinforcement learning.
In Mozer, M., Smolensky, P., Touretzky, D., Elman, J., & Weigend, A. (Eds.), Proceedings of the
4th Connectionist Models Summer School, Pittsburgh, PA, 2-5 August (pp. 255-263). Hillsdale,
NJ: Lawrence Erlbaum.
Torrey, L. (2009). Relational transfer in reinforcement learning. Unpublished PhD dissertation,
University of Wisconsin, Madison, WI.
Touzet, C. F. (1997). Neural reinforcement learning for behaviour synthesis. Robotics and
Autonomous Systems, 22(3-4), 251-281.
Tsitsiklis, J. N. & Van Roy, B. (1996). Feature-based methods for large scale dynamic programming.
Machine Learning, 22(1-3), 59-94.