REACT: Revealing Evolutionary Action Consequence Trajectories for Interpretable Reinforcement Learning

Abstract

To enhance the interpretability of Reinforcement Learning (RL), we propose Revealing Evolutionary Action Consequence Trajectories (REACT). In contrast to the prevalent practice of validating RL models based on their optimal behavior learned during training, we posit that considering a range of edge-case trajectories provides a more comprehensive understanding of their inherent behavior. To induce such scenarios, we introduce a disturbance to the initial state, optimizing it through an evolutionary algorithm to generate a diverse population of demonstrations. To evaluate the fitness of trajectories, REACT incorporates a joint fitness function that encourages both local and global diversity in the encountered states and chosen actions. Through assessments with policies trained for varying durations in discrete and continuous environments, we demonstrate the descriptive power of REACT. Our results highlight its effectiveness in revealing nuanced aspects of RL models' behavior beyond optimal performance, thereby contributing to improved interpretability.
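The idea described in the abstract (disturb the initial state, then evolve a population of initial states under a fitness that rewards local and global diversity) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the 5x5 grid world, the hand-written `policy` with a deliberate flaw standing in for a trained model, the two diversity measures (unique-state ratio within a trajectory, and share of states not visited by other trajectories), their unweighted sum as the joint fitness, and the mutate-and-truncate evolutionary loop. Only state diversity is scored; the abstract also mentions diversity of chosen actions, which this sketch omits.

```python
import random

GRID = 5          # hypothetical 5x5 grid world (assumption, not from the paper)
GOAL = (4, 4)     # hypothetical goal cell


def policy(state):
    """Stand-in for a trained policy (assumption): walk toward GOAL,
    but stall in the upper part of column 0 to mimic an undertrained model."""
    x, y = state
    if x == 0 and y > 2:
        return (0, 0)  # deliberate flaw: the agent never leaves these cells
    if x < GOAL[0]:
        return (1, 0)
    if y < GOAL[1]:
        return (0, 1)
    return (0, 0)


def rollout(start, max_steps=20):
    """Run the fixed policy from a disturbed initial state; return visited states."""
    state, states = start, [start]
    for _ in range(max_steps):
        if state == GOAL:
            break
        dx, dy = policy(state)
        state = (state[0] + dx, state[1] + dy)
        states.append(state)
    return states


def local_diversity(traj):
    """Illustrative local measure: fraction of distinct states within one trajectory."""
    return len(set(traj)) / len(traj)


def global_diversity(traj, others):
    """Illustrative global measure: share of this trajectory's states
    that no other trajectory in the population visits."""
    seen = set().union(*others)
    own = set(traj)
    return len(own - seen) / len(own)


def joint_fitness(traj, others):
    """Unweighted sum of both measures (the weighting is an assumption)."""
    return local_diversity(traj) + global_diversity(traj, others)


def mutate(start, rng):
    """Disturb an initial state by at most one cell per axis, clipped to the grid."""
    x, y = start
    x = min(GRID - 1, max(0, x + rng.choice((-1, 0, 1))))
    y = min(GRID - 1, max(0, y + rng.choice((-1, 0, 1))))
    return (x, y)


def evolve(pop_size=6, generations=10, seed=0):
    """Evolve initial states; keep the pop_size candidates with the highest joint fitness."""
    rng = random.Random(seed)
    population = [(rng.randrange(GRID), rng.randrange(GRID)) for _ in range(pop_size)]
    for _ in range(generations):
        # Parents plus one mutated child each, with duplicates removed.
        candidates = list(dict.fromkeys(population + [mutate(p, rng) for p in population]))
        trajs = {c: rollout(c) for c in candidates}

        def score(c):
            others = [trajs[o] for o in candidates if o != c]
            return joint_fitness(trajs[c], others)

        population = sorted(candidates, key=score, reverse=True)[:pop_size]
    return population


if __name__ == "__main__":
    for start in evolve():
        print(start, rollout(start))
```

In this toy setting, a rollout starting in the flawed cells repeats a single state and scores low on the local measure, while rollouts that overlap heavily with the rest of the population score low on the global measure. How REACT actually defines and balances these terms is not specified in the abstract.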
