Reinforcement Learning - Advanced Artificial Intelligence - page 386

important directions of RL. At present, the main technical difficulty is how to prove and guarantee the convergence of learning algorithms theoretically. The development of effective models for complex MDPs will also be an important direction in the future.

Exercises

1. Give a brief description of the main branches of reinforcement learning and its research history.

2. Explain the similarities and differences between reinforcement learning

models and other machine learning methods.

3. Explain the decision process of MDP and its essence.

4. Give the basic ideas of Monte Carlo methods and their applications in reinforcement learning.

5. Give the basic ideas of temporal-difference (TD) learning and illustrate its process using the game of tic-tac-toe.

6. Consider the deterministic grid world shown below with the absorbing goal-state G. The immediate rewards are shown in the figure for the labeled transitions and are 0 for all unlabeled transitions. Give the V* value for every state in this grid world. Give the Q(s, a) value for every transition. Finally, show an optimal policy using γ = 0.

[Figure: grid world with absorbing goal-state G; labels recoverable from the figure: 10, 12, 14]
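As a starting point for exercise 6, the following is a minimal sketch of value iteration on a deterministic grid world. It does not reproduce the figure from the exercise: the 2x3 layout, the goal position, the reward of 100 for entering G, and the names `transitions` and `value_iteration` are all assumptions made for illustration. The discount factor is a parameter, so the same code can be run with whatever γ the exercise specifies.

```python
# Hypothetical 2x3 deterministic grid world (NOT the figure from the exercise).
# States are (row, col). G is absorbing. Moving into G earns GOAL_REWARD,
# every other transition earns 0. All of these values are assumed.
ROWS, COLS = 2, 3
GOAL = (0, 2)
GOAL_REWARD = 100.0
ACTIONS = {"up": (-1, 0), "down": (1, 0), "left": (0, -1), "right": (0, 1)}


def transitions(state):
    """Yield (action, next_state, reward) for each legal move out of `state`."""
    if state == GOAL:  # absorbing state: no outgoing transitions
        return
    r, c = state
    for name, (dr, dc) in ACTIONS.items():
        nr, nc = r + dr, c + dc
        if 0 <= nr < ROWS and 0 <= nc < COLS:
            nxt = (nr, nc)
            yield name, nxt, (GOAL_REWARD if nxt == GOAL else 0.0)


def value_iteration(gamma, tol=1e-9):
    """Return (V, Q, policy) for the deterministic world defined above."""
    states = [(r, c) for r in range(ROWS) for c in range(COLS)]
    V = {s: 0.0 for s in states}
    while True:
        delta = 0.0
        for s in states:
            # Deterministic Bellman optimality backup:
            # V(s) = max_a [ r(s, a) + gamma * V(s') ]
            backups = [rew + gamma * V[nxt] for _, nxt, rew in transitions(s)]
            new_v = max(backups) if backups else 0.0
            delta = max(delta, abs(new_v - V[s]))
            V[s] = new_v
        if delta < tol:
            break
    # Q(s, a) = r(s, a) + gamma * V(s') for every legal transition.
    Q = {(s, a): rew + gamma * V[nxt]
         for s in states for a, nxt, rew in transitions(s)}
    # Greedy policy with respect to Q; ties go to the first action listed.
    policy = {s: max(transitions(s), key=lambda t: t[2] + gamma * V[t[1]])[0]
              for s in states if s != GOAL}
    return V, Q, policy
```

Calling `value_iteration(0.9)` on this assumed layout returns the state values, the transition values, and one greedy policy. To apply the sketch to the actual exercise, the layout and rewards would need to be replaced with those from the figure.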
