Collision Avoidance Using Partially Controlled Markov Decision Processes

Mykel J. Kochenderfer and James P. Chryssanthacopoulos
Lincoln Laboratory, Massachusetts Institute of Technology
244 Wood Street, Lexington, MA 02420, U.S.A.
{mykelk,chryssanthacopoulos}@ll.mit.edu
Abstract. Optimal collision avoidance in stochastic environments requires accounting for the likelihood and costs of future sequences of outcomes in response to different sequences of actions. Prior work has investigated formulating the problem as a Markov decision process, discretizing the state space, and solving for the optimal strategy using dynamic programming. Experiments have shown that such an approach can be very effective, but scaling to higher-dimensional problems can be challenging due to the exponential growth of the discrete state space. This paper presents an approach that can greatly reduce the complexity of computing the optimal strategy in problems where only some of the dimensions of the problem are controllable. The approach is applied to aircraft collision avoidance, where the system must recommend maneuvers to an imperfect pilot.
Keywords: Markov decision processes, Dynamic programming, Collision avoidance.
1 Introduction
Manually constructing a robust collision avoidance system, whether for an autonomous or a human-controlled vehicle, is challenging because the future effects of the system cannot be known exactly. Due to their safety-critical nature, collision avoidance systems must maintain a high degree of reliability while minimizing unnecessary path deviation. Recent work has investigated formulating the problem of collision avoidance as a Markov decision process (MDP) and solving for the optimal strategy using dynamic programming (DP) [14,15,25]. One limitation of this approach is that the computation and memory requirements grow exponentially with the dimensionality of the state space. Hence, these studies focused on MDP formulations that capture only a subset of the relevant state variables, at the expense of impaired performance.
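To make the baseline being described concrete, the following Python sketch (our illustration, not code from the paper) runs generic value iteration over a discretized MDP with random placeholder transition and reward arrays; the bins_per_dim ** n_dims arithmetic in the comments makes the exponential growth of the discrete state space explicit. All names and sizes are hypothetical.

import numpy as np

def value_iteration(T, R, gamma=0.95, tol=1e-6):
    """Generic value iteration for a discrete MDP.

    T: transition model, shape (n_actions, n_states, n_states), row-stochastic.
    R: rewards, shape (n_states, n_actions).
    Returns the optimal value function and a greedy policy.
    """
    n_states = T.shape[1]
    V = np.zeros(n_states)
    while True:
        # Q[s, a] = R[s, a] + gamma * sum_s' T[a, s, s'] * V[s']
        Q = R + gamma * np.einsum('ast,t->sa', T, V)
        V_new = Q.max(axis=1)
        if np.max(np.abs(V_new - V)) < tol:
            return V_new, Q.argmax(axis=1)
        V = V_new

# The discrete state space is the Cartesian product of the per-dimension
# grids, so it has bins_per_dim ** n_dims states: each added state dimension
# multiplies the table size (and per-iteration work) by another factor.
bins_per_dim, n_dims, n_actions = 4, 3, 3   # hypothetical small problem
n_states = bins_per_dim ** n_dims           # 4**3 = 64 discrete states

rng = np.random.default_rng(0)
T = rng.dirichlet(np.ones(n_states), size=(n_actions, n_states))
R = rng.standard_normal((n_states, n_actions))
V, policy = value_iteration(T, R)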
This paper presents a new approach for significantly reducing the computation and memory requirements for partially controlled Markov decision processes. The approach involves decomposing the problem into two separate subproblems, one controlled and one uncontrolled, that can be solved independently offline using dynamic programming, as sketched below.
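The following Python sketch (hypothetical, not the authors' implementation) illustrates the flavor of such a decomposition: the state is split into a controlled component, whose dynamics depend on the chosen action, and an uncontrolled component, whose dynamics do not. Each subproblem is solved offline on its own grid, so the tables scale with the sum rather than the product of the subproblem sizes; how the two offline solutions are combined at decision time is developed in the body of the paper.

import numpy as np

def solve_controlled(Tc, Rc, horizon):
    """Finite-horizon DP over the controlled subproblem alone.

    Tc: (n_actions, n_c, n_c) transitions of the controlled state under each
    action; Rc: (n_c, n_actions) rewards. Returns V[k, s], the optimal value
    with k steps remaining.
    """
    n_c = Tc.shape[1]
    V = np.zeros((horizon + 1, n_c))
    for k in range(1, horizon + 1):
        Q = Rc + np.einsum('ast,t->sa', Tc, V[k - 1])
        V[k] = Q.max(axis=1)
    return V

def solve_uncontrolled(Tu, p0, horizon):
    """Propagate the uncontrolled Markov chain offline.

    Tu: (n_u, n_u) row-stochastic transitions (no action dependence);
    p0: (n_u,) initial distribution. Returns dist[k], the distribution over
    the uncontrolled state after k steps.
    """
    dist = np.zeros((horizon + 1, len(p0)))
    dist[0] = p0
    for k in range(horizon):
        dist[k + 1] = dist[k] @ Tu
    return dist

# Hypothetical sizes: a joint solve would sweep n_c * n_u = 50 * 40 = 2000
# states per action; the decomposition solves one problem of size 50 and one
# of size 40.
rng = np.random.default_rng(1)
n_c, n_u, n_actions, horizon = 50, 40, 3, 20
Tc = rng.dirichlet(np.ones(n_c), size=(n_actions, n_c))
Rc = rng.standard_normal((n_c, n_actions))
Tu = rng.dirichlet(np.ones(n_u), size=n_u)
p0 = np.eye(n_u)[0]          # uncontrolled process starts in state 0
Vc = solve_controlled(Tc, Rc, horizon)
pu = solve_uncontrolled(Tu, p0, horizon)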
This work is sponsored by the Federal Aviation Administration under Air Force Contract #FA8721-05-C-0002. Opinions, interpretations, conclusions, and recommendations are those of the authors and are not necessarily endorsed by the United States Government.