References
1. Bagnell, J.A., Ng, A.Y.: On local rewards and scaling distributed reinforcement
learning. In: Advances in Neural Information Processing Systems, NIPS 2005
(2005)
2. Bernstein, D.S., Givan, R., Immerman, N., Zilberstein, S.: The complexity of decentralized control of Markov decision processes. Math. Oper. Res. 27, 819-840 (2002)
3. Bernstein, D.S., Hansen, E.A., Zilberstein, S.: Dynamic programming for partially
observable stochastic games. In: AAAI, pp. 709-715. AAAI Press / The MIT Press
(2004)
4. Busoniu, L., Babuska, R., De Schutter, B.: Multi-agent reinforcement learning: An
overview. In: Srinivasan, D., Jain, L.C. (eds.) Innovations in Multi-Agent Systems
and Applications - 1. SCI, vol. 310, pp. 183-221. Springer, Heidelberg (2010)
5. Chang, Y.H., Ho, T., Kaelbling, L.P.: All learning is local: Multi-agent learning in global reward games. In: Thrun, S., Saul, L.K., Schölkopf, B. (eds.) NIPS. MIT Press, Cambridge (2003)
6. Cormen, T.H., Leiserson, C.E., Rivest, R.L., Stein, C.: Introduction to Algorithms,
2nd edn. MIT Press, Cambridge (2001)
7. Devlin, S., Kudenko, D.: Theoretical considerations of potential-based reward shap-
ing for multi-agent systems. In: Proc. of 10th Intl. Conf. on Autonomous Agents
and Multiagent Systems (AAMAS 2011), pp. 225-232 (2011)
8. Garey, M.R., Johnson, D.S.: Computers and Intractability: A Guide to the Theory
of NP-Completeness. W. H. Freeman and Company, New York (1979)
9. Kaelbling, L.P., Littman, M.L., Cassandra, A.R.: Planning and acting in partially observable stochastic domains. Artif. Intell. 101(1-2), 99-134 (1998)
10. Kaelbling, L.P., Littman, M.L., Moore, A.W.: Reinforcement learning: A survey. J. Artif. Intell. Res. 4, 237-285 (1996)
11. Kemmerich, T., Kleine Büning, H.: A convergent multiagent reinforcement learning
approach for a subclass of cooperative stochastic games. In: Proc. of the Adaptive
Learning Agents Workshop @ AAMAS 2011, pp. 75-82 (2011)
12. Kemmerich, T., Kleine Büning, H.: Region-based heuristics for an iterative partitioning problem in multiagent systems. In: Proc. 3rd Intl. Conf. on Agents and
Artificial Intelligence (ICAART 2011), vol. 2, pp. 200-205. SciTePress (2011)
13. Melo, F.S., Ribeiro, I.: Transition entropy in partially observable Markov decision
processes. In: Arai, T., Pfeifer, R., Balch, T.R., Yokoi, H. (eds.) IAS, pp. 282-289.
IOS Press, Amsterdam (2006)
14. Mitchell, T.M.: Machine Learning. McGraw-Hill, New York (1997)
15. Ng, A.Y., Harada, D., Russell, S.J.: Policy invariance under reward transforma-
tions: Theory and application to reward shaping. In: Bratko, I., Dzeroski, S. (eds.)
ICML, pp. 278-287. Morgan Kaufmann, San Francisco (1999)
16. Oliehoek, F.A., Spaan, M.T.J., Vlassis, N.A.: Optimal and approximate Q-value
functions for decentralized POMDPs. J. Artif. Intell. Res. 32, 289-353 (2008)
17. Seuken, S., Zilberstein, S.: Formal models and algorithms for decentralized decision
making under uncertainty. Autonomous Agents and Multi-Agent Systems 17(2),
190-250 (2008)
18. Stone, P., Sutton, R.S., Kuhlmann, G.: Reinforcement learning for RoboCup-soccer
keepaway. Adaptive Behavior 13(3), 165-188 (2005)
19. Sutton, R.S., Barto, A.G.: Reinforcement Learning: An Introduction. The MIT
Press, Cambridge (1998)