References
1. Bagnell, J.A., Ng, A.Y.: On local rewards and scaling distributed reinforcement
learning. In: Advances in Neural Information Processing Systems, NIPS 2005
(2005)
2. Bernstein, D.S., Givan, R., Immerman, N., Zilberstein, S.: The complexity of decentralized control of Markov decision processes. Math. Oper. Res. 27, 819-840 (2002)
3. Bernstein, D.S., Hansen, E.A., Zilberstein, S.: Dynamic programming for partially
observable stochastic games. In: AAAI, pp. 709-715. AAAI Press / The MIT Press
(2004)
4. Busoniu, L., Babuska, R., De Schutter, B.: Multi-agent reinforcement learning: An
overview. In: Srinivasan, D., Jain, L.C. (eds.) Innovations in Multi-Agent Systems
and Applications - 1. SCI, vol. 310, pp. 183-221. Springer, Heidelberg (2010)
5. Chang, Y.H., Ho, T., Kaelbling, L.P.: All learning is local: Multi-agent learning in global reward games. In: Thrun, S., Saul, L.K., Schölkopf, B. (eds.) NIPS. MIT Press, Cambridge (2003)
6. Cormen, T.H., Leiserson, C.E., Rivest, R.L., Stein, C.: Introduction to Algorithms,
2nd edn. MIT Press, Cambridge (2001)
7. Devlin, S., Kudenko, D.: Theoretical considerations of potential-based reward shap-
ing for multi-agent systems. In: Proc. of 10th Intl. Conf. on Autonomous Agents
and Multiagent Systems (AAMAS 2011), pp. 225-232 (2011)
8. Garey, M.R., Johnson, D.S.: Computers and Intractability: A Guide to the Theory
of NP-Completeness. W. H. Freeman and Company, New York (1979)
9. Kaelbling, L.P., Littman, M.L., Cassandra, A.R.: Planning and acting in partially observable stochastic domains. Artif. Intell. 101(1-2), 99-134 (1998)
10. Kaelbling, L.P., Littman, M.L., Moore, A.W.: Reinforcement learning: A survey. J. Artif. Intell. Res. 4, 237-285 (1996)
11. Kemmerich, T., Kleine Büning, H.: A convergent multiagent reinforcement learning
approach for a subclass of cooperative stochastic games. In: Proc. of the Adaptive
Learning Agents Workshop @ AAMAS 2011, pp. 75-82 (2011)
12. Kemmerich, T., Kleine Büning, H.: Region-based heuristics for an iterative partitioning problem in multiagent systems. In: Proc. 3rd Intl. Conf. on Agents and
Artificial Intelligence (ICAART 2011), vol. 2, pp. 200-205. SciTePress (2011)
13. Melo, F.S., Ribeiro, I.: Transition entropy in partially observable Markov decision
processes. In: Arai, T., Pfeifer, R., Balch, T.R., Yokoi, H. (eds.) IAS, pp. 282-289.
IOS Press, Amsterdam (2006)
14. Mitchell, T.M.: Machine Learning. McGraw-Hill, New York (1997)
15. Ng, A.Y., Harada, D., Russell, S.J.: Policy invariance under reward transforma-
tions: Theory and application to reward shaping. In: Bratko, I., Dzeroski, S. (eds.)
ICML, pp. 278-287. Morgan Kaufmann, San Francisco (1999)
16. Oliehoek, F.A., Spaan, M.T.J., Vlassis, N.A.: Optimal and approximate Q-value
functions for decentralized POMDPs. J. Artif. Intell. Res. 32, 289-353 (2008)
17. Seuken, S., Zilberstein, S.: Formal models and algorithms for decentralized decision
making under uncertainty. Autonomous Agents and Multi-Agent Systems 17(2),
190-250 (2008)
18. Stone, P., Sutton, R.S., Kuhlmann, G.: Reinforcement learning for RoboCup-soccer
keepaway. Adaptive Behavior 13(3), 165-188 (2005)
19. Sutton, R.S., Barto, A.G.: Reinforcement Learning: An Introduction. The MIT
Press, Cambridge (1998)