2.2.2.1 Look-up Tables
Look-up tables are the most basic form of representation for reinforcement learning.
These tables store information (i.e., state values) for each and every state. The
look-up indices may be drawn either from the raw state space or from derived/hand-coded
features. This type of representation could be considered the most localized type of
representation because of the unique state-value mapping: a state-value update
influences only the individual state visited at that time. The values of all unvisited
states are unaffected, which makes this representation immune to updates of neighboring
or nearby states. The properties of look-up tables can be rigorously analyzed, and it
can be proved that some learning algorithms converge under various conditions and
assumptions, such as each state being sampled infinitely many times and the learning
rates being set appropriately (Watkins 1989; Watkins and Dayan 1992; Jaakkola et al.
2003; Tsitsiklis and Roy 1996).
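To make this locality concrete, the following sketch (not taken from the text; the toy chain environment, step sizes, and helper names are illustrative assumptions) shows tabular Q-learning in Python, where each (state, action) pair owns its own table entry and an update touches only the entry that was just visited:

    # A minimal sketch of a look-up-table representation: tabular Q-learning
    # on a hypothetical 5-state chain, where each (state, action) pair has its
    # own entry and an update changes only the visited entry.
    import random
    from collections import defaultdict

    N_STATES = 5                 # hypothetical chain of 5 discrete states
    ACTIONS = [-1, +1]           # move left or right
    ALPHA, GAMMA, EPS = 0.1, 0.95, 0.1

    Q = defaultdict(float)       # look-up table: (state, action) -> value

    def step(s, a):
        """Toy dynamics: reward 1 for reaching the right end, else 0."""
        s_next = max(0, min(N_STATES - 1, s + a))
        reward = 1.0 if s_next == N_STATES - 1 else 0.0
        return s_next, reward, s_next == N_STATES - 1

    for episode in range(500):
        s, done = 0, False
        while not done:
            # epsilon-greedy action selection from the table
            if random.random() < EPS:
                a = random.choice(ACTIONS)
            else:
                a = max(ACTIONS, key=lambda act: Q[(s, act)])
            s_next, r, done = step(s, a)
            # Q-learning update: only the single visited entry (s, a) changes;
            # values of all unvisited states are untouched (no generalization).
            best_next = 0.0 if done else max(Q[(s_next, act)] for act in ACTIONS)
            Q[(s, a)] += ALPHA * (r + GAMMA * best_next - Q[(s, a)])
            s = s_next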
Despite the attractive properties of look-up tables, they do have some disadvantages.
The primary disadvantage of look-up tables is that they are limited to relatively
small and discrete state spaces, both because of the memory constraints of computers
and because of the computationally intensive task of having to search through all
entries in the table (Baird 1995; Atkeson et al. 1997). In some cases, a simpler and
more compact representation (e.g., a neural network) can accurately model the value
function instead, and it can do so with far fewer parameters than the unique entries
required by a look-up table (Moody and Tresp 1994; Ollington et al. 2009). For
reinforcement learning applications to be scaled to real-world problems, representations
more compact than look-up tables are required (Singh et al. 1995). However, look-up
tables can handle large state spaces using some approximation methods, one of which
is state aggregation, in which similar states are assumed to have similar values
(Tsitsiklis and Roy 1996). Look-up tables can also be used for continuous state
spaces, although the states must be discretized in some manner, and this may not
be straightforward. Another disadvantage of look-up tables is that learning can be
slow due to the sheer number of states and the fact that state values are updated
individually; in other words, look-up tables have no ability to generalize state-value
updates (Tesauro 1992; Ollington et al. 2009).
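As an illustration of discretization and state aggregation (a hypothetical sketch; the two-dimensional state, bin edges, and update rule are assumptions, not from the text), a continuous state can be mapped to a coarse cell index so that all states falling in the same cell share a single table entry:

    # A minimal sketch of handling a continuous state space with a look-up
    # table via discretization/state aggregation: each continuous state is
    # mapped to a bin index, and nearby states share one aggregated entry.
    import numpy as np

    # hypothetical bin edges for a 2-D continuous state (position, velocity)
    POS_BINS = np.linspace(-1.0, 1.0, 9)    # 10 position intervals
    VEL_BINS = np.linspace(-2.0, 2.0, 9)    # 10 velocity intervals

    # aggregated value table: one entry per (position cell, velocity cell)
    V = np.zeros((len(POS_BINS) + 1, len(VEL_BINS) + 1))

    def to_index(pos, vel):
        """Map a continuous state to the index of its aggregate cell."""
        return np.digitize(pos, POS_BINS), np.digitize(vel, VEL_BINS)

    def td_update(state, reward, next_state, alpha=0.1, gamma=0.95):
        # TD(0)-style update on the aggregate cell: all states falling in the
        # same cell are assumed to share the same value (state aggregation).
        i, j = to_index(*state)
        i2, j2 = to_index(*next_state)
        V[i, j] += alpha * (reward + gamma * V[i2, j2] - V[i, j])

    td_update((0.12, -0.4), 0.0, (0.15, -0.35))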
Perhaps the earliest use of look-up tables for reinforcement learning is the work
of Michie and Chambers (1968), in which 162 'boxes' (i.e., entries in the look-up
table) are used to approximate the value function for the cart-pole balancing task.
While look-up tables could be viewed as a relatively primitive representation, they
have not been completely replaced by other representations and are still used in
contemporary research. Whiteson and Stone (2006) used a look-up table for adaptive
job routing/scheduling, and Nevmyvaka et al. (2006) used this representation for
optimizing trade execution. A more recent application proved that look-up tables
can converge to the optimal policy for best-match learning, a hybrid sparse model-
based and model-free Q-learning approach (van Seijen et al. 2011). Additionally,
Nissen (2007) used look-up tables for playing blackjack with multiple learning
algorithms, including Q(λ), Sarsa(λ) and Q-Sarsa(λ). Another use of look-up tables
is for comparative purposes, benchmarking other representations. Wiering (1995)