often used when simulating physical and mathematical systems. Because of their
reliance on repeated computation and random or pseudo-random numbers, Monte
Carlo methods are most suited to calculation by a computer. Monte Carlo
methods tend to be used when it is infeasible or impossible to compute an exact
result with a deterministic algorithm. Unlike dynamic programming (DP), Monte
Carlo methods do not assume complete knowledge of the environment. They
require only experience: sample sequences of states, actions, and rewards from
online or simulated interaction with an environment. Although a model is
required for simulated experience, it need only generate sample transitions,
not the complete probability distributions of all possible transitions that
DP methods require. The term Monte Carlo method was coined in the
1940s by physicists working on nuclear weapon projects at the Los Alamos
National Laboratory.
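The distinction between a sample model and a full transition model can be made concrete with a small sketch. Everything in it is an assumption chosen for illustration, not something taken from the text: the five-state random walk, the reward of 1 for reaching the right end, and the names sample_transition, generate_episode, and mc_state_values. It uses first-visit averaging, one common Monte Carlo variant. Note that the learner only ever calls sample_transition; it never reads a table of transition probabilities.

```python
import random
from collections import defaultdict

# Hypothetical environment: states 0..4 on a line, 0 and 4 are terminal.
TERMINALS = {0, 4}


def sample_transition(state, rng):
    """Sample model: return one (next_state, reward) draw and nothing else."""
    next_state = state + rng.choice((-1, 1))  # step left or right at random
    reward = 1.0 if next_state == 4 else 0.0  # reward only at the right end
    return next_state, reward


def generate_episode(start, rng):
    """Roll out one episode as a list of (state, reward) pairs."""
    episode, state = [], start
    while state not in TERMINALS:
        next_state, reward = sample_transition(state, rng)
        episode.append((state, reward))
        state = next_state
    return episode


def mc_state_values(num_episodes, seed=0):
    """Estimate state values by averaging first-visit, undiscounted returns."""
    rng = random.Random(seed)  # seeded so runs are repeatable
    return_sum = defaultdict(float)
    visit_count = defaultdict(int)
    for _ in range(num_episodes):
        episode = generate_episode(rng.choice((1, 2, 3)), rng)
        g = 0.0
        first_visit_return = {}
        # Walk backwards so g is the return that follows each step; the
        # earliest visit to a state is processed last and overwrites the rest.
        for state, reward in reversed(episode):
            g += reward
            first_visit_return[state] = g
        for state, g_first in first_visit_return.items():
            return_sum[state] += g_first
            visit_count[state] += 1
    return {s: return_sum[s] / visit_count[s] for s in sorted(return_sum)}
```

For this particular walk the exact values of states 1, 2, and 3 are 0.25, 0.5, and 0.75, so a call such as mc_state_values(20000) should return estimates close to those figures, with the error shrinking as the number of episodes grows.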
Fig. 10.4. Monte Carlo Methods
Monte Carlo methods are ways of solving the reinforcement learning problem
based on averaging sample returns. There is no single Monte Carlo method;
instead, the term describes a large and widely used class of approaches. However,
these approaches tend to follow a particular pattern:
(1) Define a domain of possible inputs.
(2) Generate inputs randomly from the domain, and perform a deterministic
computation on them.
(3) Aggregate the results of the individual computations into the final
result.
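The three steps above can be sketched with a standard textbook illustration, estimating pi by random sampling. This example is an assumption added for clarity and does not come from the text; the function name estimate_pi and its parameters are hypothetical.

```python
import random


def estimate_pi(num_samples, seed=0):
    """Estimate pi from points sampled uniformly in the unit square."""
    rng = random.Random(seed)  # seeded so runs are repeatable
    inside = 0
    for _ in range(num_samples):
        # Steps 1 and 2: the domain is the unit square [0, 1) x [0, 1);
        # draw one point from it at random.
        x, y = rng.random(), rng.random()
        # Step 2, continued: a deterministic test of whether the point
        # lies inside the quarter circle of radius 1.
        if x * x + y * y <= 1.0:
            inside += 1
    # Step 3: aggregate. The fraction of hits approximates pi / 4.
    return 4.0 * inside / num_samples
```

The quarter circle covers pi / 4 of the unit square, so the hit fraction times 4 approaches pi as num_samples grows. The error shrinks roughly in proportion to one over the square root of the sample count, which is why such estimates typically need many samples.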