Skoulakis, I. & Lagoudakis, M. (2012). Efficient reinforcement learning in adversarial games. In Proceedings of the 24th IEEE International Conference on Tools with Artificial Intelligence (ICTAI), Athens, Greece, 7-9 November (pp. 704-711). doi: 10.1109/ICTAI.2012.100
Smart, W. D. (2002). Making reinforcement learning work on real robots. Unpublished PhD dissertation, Brown University, Providence, RI.
Smart, W. D. & Kaelbling, L. P. (2002). Effective reinforcement learning for mobile robots. In Proceedings of the IEEE International Conference on Robotics and Automation (ICRA), Washington, D.C., 11-15 May (Vol. 4, pp. 3404-3410). doi: 10.1109/ROBOT.2002.1014237
Smith, A. J. (2002). Applications of the self-organising map to reinforcement learning. Neural Networks, 15(8-9), 1107-1124.
Stanley, K. O. & Miikkulainen, R. (2002). Evolving neural networks through augmenting topologies. Evolutionary Computation, 10(2), 99-127.
Sutton, R. S. (1984). Temporal credit assignment in reinforcement learning. Unpublished PhD dissertation, University of Massachusetts, Amherst, MA.
Sutton, R. S. (1996). Generalization in reinforcement learning: Successful examples using sparse coarse coding. In Advances in Neural Information Processing Systems 8 (pp. 1038-1044). Cambridge, MA: MIT Press.
Sutton, R. S. & Barto, A. G. (1998). Reinforcement Learning: An Introduction. Cambridge, MA: MIT Press.
Sutton, R. S., McAllester, D., Singh, S., & Mansour, Y. (2000). Policy gradient methods for reinforcement learning with function approximation. In Advances in Neural Information Processing Systems 12 (pp. 1057-1063). Cambridge, MA: MIT Press.
Sutton, R. S., Maei, H. R., Precup, D., Bhatnagar, S., Silver, D., Szepesvári, C., & Wiewiora, E. (2009a). Fast gradient-descent methods for temporal-difference learning with linear function approximation. In Proceedings of the 26th International Conference on Machine Learning, Montreal, Quebec, 14-18 June (pp. 993-1000). doi: 10.1145/1553374.1553501
Sutton, R. S., Szepesvári, C., & Maei, H. R. (2009b). A convergent O(n) algorithm for off-policy temporal-difference learning with linear function approximation. In Advances in Neural Information Processing Systems 21 (pp. 1609-1616). Cambridge, MA: MIT Press.
Szepesvári, C. (2010). Algorithms for Reinforcement Learning. San Rafael, CA: Morgan & Claypool.
Tan, A.-H., Lu, N., & Xiao, D. (2008). Integrating temporal difference methods and self-organizing neural networks for reinforcement learning with delayed evaluative feedback. IEEE Transactions on Neural Networks, 19(2), 230-244.
Taylor, M. E. & Stone, P. (2009). Transfer learning for reinforcement learning domains: A survey. Journal of Machine Learning Research, 10(1), 1633-1685.
Tesauro, G. (1992). Practical issues in temporal difference learning. Machine Learning, 8(3-4), 257-277.
Tesauro, G. (1995). Temporal difference learning and TD-Gammon. Communications of the ACM, 38(3), 58-68.
Tesauro, G., Jong, N. K., Das, R., & Bennani, M. N. (2007). On the use of hybrid reinforcement learning for autonomic resource allocation. Cluster Computing, 10(3), 287-299.
Thrun, S. (1995). Learning to play the game of chess. In Advances in Neural Information Processing Systems 7 (pp. 1069-1076). Cambridge, MA: MIT Press.
Thrun, S. & Schwartz, A. (1993). Issues in using function approximation for reinforcement learning. In Mozer, M., Smolensky, P., Touretzky, D., Elman, J., & Weigend, A. (Eds.), Proceedings of the 4th Connectionist Models Summer School, Pittsburgh, PA, 2-5 August (pp. 255-263). Hillsdale, NJ: Lawrence Erlbaum.
Torrey, L. (2009). Relational transfer in reinforcement learning. Unpublished PhD dissertation, University of Wisconsin, Madison, WI.
Touzet, C. F. (1997). Neural reinforcement learning for behaviour synthesis. Robotics and Autonomous Systems, 22(3-4), 251-281.
Tsitsiklis, J. N. & Van Roy, B. (1996). Feature-based methods for large scale dynamic programming. Machine Learning, 22(1-3), 59-94.