Demonstration Guided Multi-Objective Reinforcement Learning

Abstract

Multi-objective reinforcement learning (MORL) is increasingly relevant due to its resemblance to real-world scenarios requiring trade-offs between multiple objectives. Because a MORL agent must cater to diverse user preferences, the challenges of traditional reinforcement learning are amplified. To address the difficulty of training policies from scratch in MORL, we introduce demonstration-guided multi-objective reinforcement learning (DG-MORL). This novel approach utilizes prior demonstrations, aligns them with user preferences via corner weight support, and incorporates a self-evolving mechanism to refine suboptimal demonstrations. Our empirical studies demonstrate DG-MORL's superiority over existing MORL algorithms, establishing its robustness and efficacy, particularly under challenging conditions. We also provide an upper bound on the algorithm's sample complexity.
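
To make the idea of aligning demonstrations with user preferences concrete, the following minimal sketch assumes a linear utility: a preference is a weight vector over objectives, and each demonstration is summarized by its vector return. It picks the demonstration with the highest weighted sum for a given preference. This is an illustration only, not the DG-MORL procedure; the names `scalarize`, `best_demonstration`, and the example returns are hypothetical, and corner weight computation is not shown.

```python
from typing import Sequence


def scalarize(returns: Sequence[float], weights: Sequence[float]) -> float:
    """Linear scalarization: weighted sum of a vector return (assumed utility)."""
    if len(returns) != len(weights):
        raise ValueError("returns and weights must have the same length")
    return sum(r * w for r, w in zip(returns, weights))


def best_demonstration(
    demo_returns: Sequence[Sequence[float]], weights: Sequence[float]
) -> int:
    """Index of the demonstration with the highest scalarized return."""
    if not demo_returns:
        raise ValueError("at least one demonstration is required")
    return max(
        range(len(demo_returns)),
        key=lambda i: scalarize(demo_returns[i], weights),
    )


# Hypothetical vector returns of three demonstrations over two objectives.
demo_returns = [(10.0, 1.0), (6.0, 6.0), (1.0, 9.0)]

# Different preference weight vectors select different demonstrations.
for weights in [(0.9, 0.1), (0.5, 0.5), (0.1, 0.9)]:
    print(weights, best_demonstration(demo_returns, weights))
```

Under this assumption, a preference that favors the first objective selects the first demonstration, a balanced preference selects the second, and a preference that favors the second objective selects the third.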
