Game-Theoretic Robust Reinforcement Learning Handles Temporally-Coupled Perturbations

Abstract

Deploying reinforcement learning (RL) systems requires robustness to uncertainty and model misspecification, yet prior robust RL methods typically only study noise introduced independently across time. However, practical sources of uncertainty are usually coupled across time. We formally introduce temporally-coupled perturbations, presenting a novel challenge for existing robust RL methods. To tackle this challenge, we propose GRAD, a novel game-theoretic approach that treats the temporally-coupled robust RL problem as a partially observable two-player zero-sum game. By finding an approximate equilibrium within this game, GRAD optimizes for general robustness against temporally-coupled perturbations. Experiments on continuous control tasks demonstrate that, compared with prior methods, our approach achieves a higher degree of robustness to various types of attacks on different attack domains, both in settings with temporally-coupled perturbations and decoupled perturbations.
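
To make the contrast between decoupled and temporally-coupled noise concrete, the sketch below samples a perturbation sequence under one plausible reading of "temporally coupled": each perturbation stays within a per-step budget, and consecutive perturbations may differ by at most a second, smaller budget. This reading, the L-infinity bounds, and the names `sample_coupled_perturbations`, `eps`, and `eps_bar` are assumptions made for illustration; the abstract does not give the formal definition, and this is not GRAD or the paper's attacker.

```python
import random


def clip(x, lo, hi):
    """Clamp x to the closed interval [lo, hi]."""
    return max(lo, min(hi, x))


def sample_coupled_perturbations(horizon, dim, eps, eps_bar, seed=0):
    """Sample a random perturbation sequence p_0, ..., p_{horizon-1}.

    Assumed constraints (illustrative, not taken from the abstract):
      * per-step bound:  |p_t[i]| <= eps
      * temporal bound:  |p_t[i] - p_{t-1}[i]| <= eps_bar
    """
    rng = random.Random(seed)
    prev = [rng.uniform(-eps, eps) for _ in range(dim)]
    sequence = [prev]
    for _ in range(horizon - 1):
        cur = []
        for p in prev:
            # Move at most eps_bar away from the previous value, then clamp
            # back into the per-step budget. Clamping toward an interval that
            # already contains p never increases the distance to p.
            step = rng.uniform(-eps_bar, eps_bar)
            cur.append(clip(p + step, -eps, eps))
        sequence.append(cur)
        prev = cur
    return sequence


if __name__ == "__main__":
    seq = sample_coupled_perturbations(horizon=5, dim=2, eps=0.1, eps_bar=0.02)
    for t, p in enumerate(seq):
        print(t, [round(v, 4) for v in p])
```

In this sketch, letting `eps_bar` grow to `2 * eps` makes the temporal bound vacuous, so only the per-step budget remains and the sequence becomes a bounded random walk rather than a smoothly drifting one.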
