GOV-REK: Governed Reward Engineering Kernels for Designing Robust Multi-Agent Reinforcement Learning Systems

Abstract

For multi-agent reinforcement learning systems (MARLS), the problem formulation generally involves investing massive reward engineering effort specific to a given problem. However, this effort often cannot be translated to other problems; worse, it gets wasted when system dynamics change drastically. This problem is further exacerbated in sparse reward scenarios, where a meaningful heuristic can assist in the policy convergence task. We propose GOVerned Reward Engineering Kernels (GOV-REK), which dynamically assign reward distributions to agents in MARLS during the learning stage. We also introduce governance kernels, which exploit the underlying structure in either the state or the joint action space to assign meaningful agent reward distributions. During the agent learning stage, GOV-REK iteratively explores different reward distribution configurations with a Hyperband-like algorithm to learn ideal agent reward models in a problem-agnostic manner. Our experiments demonstrate that the resulting meaningful reward priors robustly jumpstart the learning process across different MARL problems.
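
As a rough illustration of what a Hyperband-like search over reward distribution configurations could look like, the sketch below runs successive halving, the elimination loop that Hyperband builds on. It is not the authors' implementation. The `evaluate` function, the `kernel` and `scale` fields, and the toy scoring rule are all hypothetical stand-ins; a real system would train the agents under each reward configuration for the given budget and report their return.

```python
import math
import random


def evaluate(config, budget, rng):
    """Hypothetical stand-in for training agents under one reward configuration.

    Returns a noisy score whose noise shrinks as the budget grows. The toy
    rule below (best scale is 0.5) is an assumption made for illustration.
    """
    quality = -abs(config["scale"] - 0.5)
    noise = rng.gauss(0.0, 0.1 / math.sqrt(budget))
    return quality + noise


def successive_halving(configs, min_budget=1, eta=3, seed=0):
    """Keep the top 1/eta configurations each round and multiply the budget by eta."""
    rng = random.Random(seed)
    survivors = list(configs)
    budget = min_budget
    while len(survivors) > 1:
        # Score every surviving configuration at the current budget.
        scored = sorted(
            survivors,
            key=lambda c: evaluate(c, budget, rng),
            reverse=True,
        )
        # Retain the best fraction, always keeping at least one.
        keep = max(1, len(survivors) // eta)
        survivors = scored[:keep]
        budget *= eta
    return survivors[0]


# Hypothetical candidate reward configurations: one kernel type, varying scale.
candidates = [{"kernel": "radial", "scale": s / 10} for s in range(11)]
best = successive_halving(candidates)
print(best)
```

Cheap, low-budget evaluations discard weak reward configurations early, so most of the training budget goes to the few configurations that survive.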
