2.2.1.3 Action Space Dimensions
This section describes characteristics that are related to the action space of the domain,
or those that are most closely associated with the actions of the agent.
Action Space Continuity Action space continuity refers to whether the actions
of the agent are discrete or continuous. Often there is some relation between the
continuity of the state space and the action space, where discrete state spaces have
discrete action spaces and continuous state spaces have continuous action spaces, but
this is not always the case. The most common pairing is a discrete state space with a
discrete action space, as in most games. Continuous state domains
are often control-type problems or have some underlying real-world dynamics, and
these can use either discrete actions (e.g., the mountain car domain) or continuous actions
(e.g., robot control).
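The two pairings above can be contrasted in a minimal sketch. The three throttle actions for mountain car match that domain's standard formulation; the torque bounds for the robot-control case are assumed values chosen only for illustration.

```python
import random

# Mountain car (continuous states, discrete actions): three throttle choices.
MOUNTAIN_CAR_ACTIONS = [-1, 0, 1]  # push left, no push, push right

def sample_discrete_action():
    """Sample uniformly from a discrete action space."""
    return random.choice(MOUNTAIN_CAR_ACTIONS)

# Robot control (continuous states, continuous actions): a bounded joint torque.
TORQUE_LOW, TORQUE_HIGH = -2.0, 2.0  # assumed torque limits, for illustration

def sample_continuous_action():
    """Sample uniformly from a continuous (interval-bounded) action space."""
    return random.uniform(TORQUE_LOW, TORQUE_HIGH)

print(sample_discrete_action())    # one of -1, 0, 1
print(sample_continuous_action())  # a real number in [-2.0, 2.0]
```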
Branching Factor The branching factor refers to the number of actions that can be
taken from any state. Some domains have constant branching factors where there
is a constant number of possible actions that can be taken from any and all states.
Other domains have non-constant branching factors where the number of actions
from any state may increase or decrease depending on the state, and the branching
factor therefore takes on a distribution over the state space. The branching factor can
also be thought of as a form of constraint in the sense that the state trajectory of the
agent is somewhat guided or constrained, rather than allowing for the entire state
space to be reached from any other state.
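A state-dependent branching factor can be illustrated with a tic-tac-toe-like board (a hypothetical example, not one from the text): the number of legal actions equals the number of empty cells, so it shrinks as the episode progresses.

```python
# The branching factor from a state is the size of its legal-action set.
def legal_actions(board):
    """Return indices of empty cells; the list's length is the branching factor."""
    return [i for i, cell in enumerate(board) if cell == " "]

empty_board = [" "] * 9
mid_game = ["X", "O", " ", "X", " ", " ", "O", " ", " "]

print(len(legal_actions(empty_board)))  # 9 actions from the initial state
print(len(legal_actions(mid_game)))     # 5 actions remain
```

A domain with a constant branching factor would instead return the same number of actions for every state.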
2.2.1.4 Reward Dimension
Reward characteristics are specifically related to the reward function. As mentioned,
while the term reward has a positive connotation, this term is used for any type
of feedback provided to the agent and could therefore be aversive (i.e., negative).
Loosely, a reward is any concrete information provided to the agent that is
indicative of the true value or quality of being in a state or following a trajectory.
Reward Stochasticity Reward stochasticity refers to whether the rewards
for any particular state are deterministic or stochastic. More specifically, this char-
acteristic specifies if rewards are provided every time the agent visits a particular
state or if rewards are provided with some probability. The vast majority of domains
use deterministic reward functions where the reward is fixed to a particular state or a
group of states. However, domains may also provide rewards a fraction of the time,
thus providing relatively less feedback to the agent. Note that this characteristic
refers neither to how many states have rewards nor to the magnitude of the reward(s);
those characteristics are defined next.
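The distinction can be sketched as two reward functions over the same states. The goal-state label and the probability p below are assumptions made only for this illustration.

```python
import random

GOAL_STATE = "goal"  # hypothetical state label for illustration

def deterministic_reward(state):
    """Reward is fixed to the state: paid on every visit to the goal."""
    return 1.0 if state == GOAL_STATE else 0.0

def stochastic_reward(state, p=0.3, rng=random.random):
    """Same reward state, but feedback arrives only with probability p."""
    return 1.0 if state == GOAL_STATE and rng() < p else 0.0

print(deterministic_reward("goal"))  # always 1.0
print(stochastic_reward("goal"))     # 1.0 about 30% of visits, else 0.0
```

Both functions attach reward to the same state; only the reliability of the feedback differs, which is exactly the characteristic described above.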
Reward Distribution The reward distribution refers to how the rewards are dis-
tributed over the state space, as well as the magnitude of these rewards. Some domains
have a single reward state (e.g., the mountain car problem), whereas other domains