Chapter 10
Reinforcement Learning
10.1 Introduction
People often learn by interacting with the outside environment. Reinforcement
learning (RL) is a computational approach to the study of learning from
interaction. RL is the learning of a mapping from situations to actions so as to
maximize a scalar reward or reinforcement signal. The learner is not told
directly which actions to take, as in most forms of machine learning, but
must instead discover which actions yield the most reward by trying them. In the most
interesting and challenging cases, an action may affect not only the immediate
reward, but also the next situation, and consequently all subsequent rewards.
These two characteristics, trial-and-error search and delayed reinforcement, are
the two most important distinguishing features of RL.
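As a minimal sketch of these two characteristics, consider a hypothetical
five-state corridor in which the only reward is given on reaching the last
state. The example below uses tabular Q-learning, one RL method chosen here
purely for illustration; the environment, the function names (step, train,
greedy_policy) and the parameter values (alpha, gamma, epsilon) are all
assumptions and do not come from the text. The agent is never told which action
is correct. It tries actions, observes a scalar reward, and propagates the
delayed reward back to earlier situations.

```python
import random

N_STATES = 5          # states 0..4; state 4 is the goal (terminal)
ACTIONS = (0, 1)      # 0 = move left, 1 = move right


def step(state, action):
    """Hypothetical environment: returns (next_state, reward, done)."""
    next_state = max(0, state - 1) if action == 0 else state + 1
    done = next_state == N_STATES - 1
    reward = 1.0 if done else 0.0   # scalar reinforcement, given only at the goal
    return next_state, reward, done


def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Learn action values by trial and error (tabular Q-learning)."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]   # q[state][action]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            # Trial and error: explore at random sometimes (and on ties),
            # otherwise exploit the action currently believed to be best.
            if rng.random() < epsilon or q[state][0] == q[state][1]:
                action = rng.choice(ACTIONS)
            else:
                action = 0 if q[state][0] > q[state][1] else 1
            next_state, reward, done = step(state, action)
            # Delayed reinforcement: the value of the next situation is
            # discounted by gamma and folded into the current estimate.
            target = reward + (0.0 if done else gamma * max(q[next_state]))
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q


def greedy_policy(q):
    """Mapping from each non-goal situation to its highest-valued action."""
    return [0 if q[s][0] > q[s][1] else 1 for s in range(N_STATES - 1)]
```

Calling greedy_policy(train()) yields the learned mapping from situations to
actions. With these settings the agent learns to move right from every non-goal
state, even though only the final step is ever rewarded directly.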
Reinforcement learning is not defined by characterizing learning methods, but
by characterizing a learning problem. Any method that is well suited to solving
that problem is considered a reinforcement learning method. RL addresses
the question of how an autonomous agent that senses and acts in its environment
can learn to choose optimal actions to achieve its goals. RL is very different from
supervised learning, the kind of learning studied in almost all current research in
machine learning, statistical pattern recognition, and artificial neural networks.
Supervised learning is learning under the tutelage of a knowledgeable supervisor,
or “teacher”, which explicitly tells the learning agent how it should respond to
training inputs. RL, by contrast, concerns a family of problems in which an
agent improves its behavior by analyzing the consequences of its actions, guided
only by a simple scalar signal (the reinforcement) returned by the environment.
The study of RL has developed into an unusually multidisciplinary field; it
includes researchers specializing in artificial intelligence, psychology, control