Abstractions for devising compact controllers for MDPs

Kolman Vornovitsky, M.Sc. Thesis Seminar
Wednesday, 17.2.2010, 15:30
Taub 601
Dr. Carmel Domshlak

Planning a course of action is a key ability for intelligent systems. It involves the representation of actions and world models, reasoning about the effects of actions, and techniques for efficiently searching the space of possible plans. Planning under uncertainty is the subject of decision-theoretic planning (DTP). In such problems, actions have stochastic effects, and the goal is to devise a policy of acting with high expected utility, as opposed to deterministic planning, which looks for a sequence of actions that takes an agent from an initial state to a goal state. The classical mathematical framework for decision-theoretic planning is that of Markov decision processes (MDPs).

One of the most interesting insights emerging from AI planning, reasoning under uncertainty, decision analysis, and OR is that many DTP problems exhibit considerable structure; in particular, they admit compact variable-based representations, as is the typical practice in planning. However, most solution methods for MDPs, i.e., algorithms that compute an optimal policy, assume an explicit state space. We use a technique known in deterministic planning as over-approximating abstraction to approximately solve MDPs with exponentially large state spaces. Very roughly, an abstraction is a mapping that reduces the size of the state space by contracting several states into one. By making the abstract space small enough, it becomes feasible to apply the standard explicit-state MDP solution methods to the abstract state space. The "merge-and-shrink" methodology introduced by Dräger, Finkbeiner & Podelski was adopted as an over-approximating abstraction in the context of deterministic planning by Helmert, Haslum & Hoffmann. We adapt the latter technique for devising compact controllers for MDPs. Additionally, we extend "merge-and-shrink", both for MDPs and for deterministic planning, with action abstraction techniques.
Finally, we evaluate our methods on a clear testbed and compare their effectiveness with other state-of-the-art approaches.
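To make the abstraction idea concrete, here is a minimal Python sketch (not the merge-and-shrink construction itself): a tiny hand-invented MDP, an invented mapping that contracts two symmetric states into one abstract state, the induced abstract MDP (transitions and rewards aggregated uniformly over each block), and standard value iteration run on the abstract space. All states, actions, and numbers are illustrative assumptions.

```python
# Sketch: solving a small MDP through a state abstraction.
# The MDP and the mapping `alpha` below are invented for illustration;
# real merge-and-shrink builds the mapping incrementally from the
# factored (variable-based) representation of the problem.

GAMMA = 0.9

# Concrete MDP: 4 states, 2 actions.
# T[s][a] = list of (next_state, probability); R[s][a] = reward.
T = {
    0: {'a': [(1, 1.0)],           'b': [(2, 1.0)]},
    1: {'a': [(3, 0.8), (1, 0.2)], 'b': [(0, 1.0)]},
    2: {'a': [(3, 0.8), (2, 0.2)], 'b': [(0, 1.0)]},
    3: {'a': [(3, 1.0)],           'b': [(3, 1.0)]},
}
R = {
    0: {'a': 0.0, 'b': 0.0},
    1: {'a': 1.0, 'b': 0.0},
    2: {'a': 1.0, 'b': 0.0},
    3: {'a': 0.0, 'b': 0.0},
}

# Abstraction: contract the symmetric states 1 and 2 into one
# abstract state.  alpha maps each concrete state to its image.
alpha = {0: 0, 1: 1, 2: 1, 3: 2}

def induced_mdp(T, R, alpha):
    """Aggregate transitions and rewards uniformly over each block."""
    blocks = {}
    for s, t in alpha.items():
        blocks.setdefault(t, []).append(s)
    aT, aR = {}, {}
    for t, members in blocks.items():
        aT[t], aR[t] = {}, {}
        for a in T[members[0]]:
            probs = {}
            for s in members:
                for s2, p in T[s][a]:
                    probs[alpha[s2]] = probs.get(alpha[s2], 0.0) + p / len(members)
            aT[t][a] = list(probs.items())
            aR[t][a] = sum(R[s][a] for s in members) / len(members)
    return aT, aR

def value_iteration(T, R, eps=1e-6):
    V = {s: 0.0 for s in T}
    while True:
        V2 = {s: max(R[s][a] + GAMMA * sum(p * V[s2] for s2, p in T[s][a])
                     for a in T[s])
              for s in T}
        if max(abs(V2[s] - V[s]) for s in T) < eps:
            return V2
        V = V2

aT, aR = induced_mdp(T, R, alpha)
aV = value_iteration(aT, aR)
# Lift: each concrete state inherits the value of its abstract image,
# giving an approximate compact value function / controller.
V_approx = {s: aV[alpha[s]] for s in alpha}
print(V_approx)
```

Value iteration runs on 3 abstract states instead of 4; for large factored MDPs this gap is exponential. Here the two merged states happen to be symmetric, so the lifted values are exact; in general, contraction loses information and the lifted policy is only an approximation.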
