Reinforcement learning policies in Markov decision processes (MDPs) often behave unexpectedly, especially in sparse-reward environments, which complicates debugging and verification. We propose a general framework for discrete MDPs that generates two complementary one-step explanations for single-action anomalies: (1) minimal counterfactual states, the smallest factored-state perturbations that flip a chosen action, and (2) robustness regions, contiguous state neighborhoods over which the original action remains invariant.
Our black-box technique requires no access to the policy's internals: it relies only on action feedback, applies to any discrete RL setting, and has been validated on several Gymnasium environments, providing actionable insight into when and why policies change their decisions.
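As a rough illustration of this black-box setup (a sketch, not the paper's implementation), the code below assumes a factored discrete state given as an integer tuple and a policy exposed only through an `act(state) -> action` callback. The helper names `minimal_counterfactual` and `robustness_region`, the unit-step perturbation set, and the per-axis interval form of the region are illustrative assumptions; bounds checking against the environment's state space is omitted.

```python
from itertools import combinations, product

def minimal_counterfactual(act, state, deltas=(-1, 1), max_features=None):
    """Breadth-first search over factored-state perturbations, smallest first.

    `act` is the only access to the policy: a black-box callback mapping a
    state tuple to a discrete action. Returns the first perturbed state whose
    action differs from the original, or None if nothing within the search
    budget flips it.
    """
    base_action = act(state)
    n = len(state)
    max_features = max_features or n
    # Perturb 1 feature, then 2, ... so the first hit is minimal in the
    # number of changed features (per-feature step size limited to `deltas`).
    for k in range(1, max_features + 1):
        for idx in combinations(range(n), k):
            for steps in product(deltas, repeat=k):
                candidate = list(state)
                for i, d in zip(idx, steps):
                    candidate[i] += d
                candidate = tuple(candidate)
                if act(candidate) != base_action:
                    return candidate
    return None

def robustness_region(act, state, max_radius=10):
    """Per-feature interval around `state` on which the action is unchanged.

    Expands each feature independently, one unit at a time, and stops in a
    direction as soon as the black-box action changes. Returns a list of
    (low, high) bounds, one per state feature.
    """
    base_action = act(state)
    region = []
    for i in range(len(state)):
        low = high = state[i]
        for _ in range(max_radius):  # grow downward
            probe = state[:i] + (low - 1,) + state[i + 1:]
            if act(probe) != base_action:
                break
            low -= 1
        for _ in range(max_radius):  # grow upward
            probe = state[:i] + (high + 1,) + state[i + 1:]
            if act(probe) != base_action:
                break
            high += 1
        region.append((low, high))
    return region
```

For a Gymnasium policy, `act` could simply wrap the trained agent, e.g. `act = lambda s: int(policy(s))`; filtering out states that are invalid in the environment is left out of this sketch.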