[Figure: tree of domain characteristics, branching into general characteristics, state space, action space, and rewards, refined into time horizon, number of agents, stationarity, complexity, continuity, dimensionality, observability, stochasticity, distribution, encoding scheme, and branching factor.]

Fig. 2.7 Domains can be characterized by general, state space, action space, and reward characteristics, each of which can be considered at a finer level.
Time Horizon The time horizon refers to the length of time over which learning occurs. Decision processes generally have one of two types of horizon: finite or infinite. Finite-horizon problems, which are the ones considered in this work and are also known as episodic problems, have some termination criterion that ends an episode, which may be a finite time limit or, more often, reaching some absorbing state. These are the types of problems that classical reinforcement learning considers almost exclusively. Infinite-horizon problems have no absorbing state; instead, the learning and decision-making processes extend infinitely in time. Such problems are often geared toward real-world or business-like domains, such as truck routing optimization, and are more associated with the stochastic optimization or approximate dynamic programming communities (Powell 2007).
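The episodic structure described above can be sketched in a few lines. The domain below is a hypothetical one-dimensional random walk invented purely for illustration (the state space, constants, and function names are assumptions, not part of any benchmark): the episode terminates on reaching an absorbing boundary state or, as a fallback, on hitting a finite time limit.

```python
import random

# Illustrative 1-D random-walk domain (an assumption for this sketch):
# states 0..N, where states 0 and N are absorbing (terminal).
N = 10
TIME_LIMIT = 1000  # finite time limit as a fallback termination criterion

def is_absorbing(state):
    """An episode ends on reaching an absorbing state."""
    return state == 0 or state == N

def run_episode(start=N // 2):
    """Run one episode of a finite-horizon (episodic) problem.

    Termination is triggered either by reaching an absorbing state
    or, less commonly, by exhausting the finite time limit.
    """
    state, steps = start, 0
    while not is_absorbing(state) and steps < TIME_LIMIT:
        action = random.choice([-1, +1])  # random policy, for illustration
        state += action
        steps += 1
    return state, steps
```

An infinite-horizon variant of the same loop would simply have no absorbing states and no time limit, running (in principle) forever while learning online.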
Number of Agents This characteristic refers to the number of agents present in the environment. If there are multiple agents, the agents also interact with each other. Many reinforcement learning problems, especially benchmark problems, have a single agent (e.g., Gridworld, mountain car), as do most control problems. Other problems, such as games, often have two agents with opposing goals. In still other problems, more than two agents may compete for their own goals or for a single goal, or teams of agents may compete for a single goal or opposing goals (Littman 2001). As stated, this dimension is also tied to the number of goals, which will be defined later. In this work, only single-agent domains are considered.
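The two-agent, opposing-goals case can be made concrete with matching pennies, a standard zero-sum game of the kind studied in multi-agent reinforcement learning (Littman 2001). The payoff table below is the classic one; the function name is an illustrative assumption.

```python
# Matching pennies: two agents with strictly opposing goals.
# Each agent plays "H" or "T"; agent A wins if the choices match,
# agent B wins otherwise. Rewards always sum to zero.

def rewards(action_a, action_b):
    """Return (reward_a, reward_b) for a joint action."""
    r_a = 1 if action_a == action_b else -1
    return r_a, -r_a
```

Because the rewards sum to zero for every joint action, any gain for one agent is exactly the other agent's loss, which is what makes the goals opposing rather than merely different.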
Domain Stationarity Domain stationarity refers to how any characteristic or structural property of the domain changes over the course of agent-environment interactions. A stationary domain has characteristics that do not change, whereas a non-stationary domain has characteristics that do change with time. It is important to distinguish stationarity from stochasticity, which is used to describe other domain