Verco: Learning Coordinated Verbal Communication for Multi-agent Reinforcement Learning

Abstract

In recent years, multi-agent reinforcement learning algorithms have made significant advancements in diverse gaming environments, leading to increased interest in the broader application of such techniques. To address the prevalent challenge of partial observability, communication-based algorithms have improved cooperative performance by sharing numerical embeddings between agents. However, the understanding of how collaborative mechanisms form is still very limited, which makes designing a human-understandable communication mechanism a valuable problem to address. In this paper, we propose a novel multi-agent reinforcement learning algorithm that embeds large language models into agents, endowing them with the ability to generate human-understandable verbal communication. The framework consists of a message module and an action module. The message module is responsible for generating and sending verbal messages to other agents, effectively enhancing information sharing among agents. To further enhance the message module, we employ a teacher model to generate message labels from the global view and update the student model through Supervised Fine-Tuning (SFT). The action module receives messages from other agents and selects actions based on current local observations and received messages. Experiments conducted on the Overcooked game demonstrate that our method significantly enhances the learning efficiency and performance of existing methods, while also providing an interpretable tool for humans to understand the process of multi-agent cooperation.
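The two-module design described in the abstract can be illustrated with a minimal sketch of one communication round. This is not the authors' implementation: the names `Agent`, `communication_round`, `describe`, and `toy_policy` are hypothetical, the LLM-backed modules are replaced by plain Python functions, and the teacher model and SFT update are omitted entirely.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Hypothetical stand-ins: the paper backs both modules with language models;
# here they are ordinary functions so the sketch runs without any model.
MessageFn = Callable[[str], str]             # local observation -> verbal message
ActionFn = Callable[[str, List[str]], str]   # (local observation, inbox) -> action


@dataclass
class Agent:
    name: str
    message_module: MessageFn
    action_module: ActionFn


def communication_round(agents: List[Agent], observations: Dict[str, str]) -> Dict[str, str]:
    """Run one round: all agents send messages, then all agents pick actions."""
    # Phase 1: each agent turns its local observation into a verbal message.
    messages = {a.name: a.message_module(observations[a.name]) for a in agents}
    # Phase 2: each agent acts on its own observation plus the others' messages.
    actions = {}
    for a in agents:
        inbox = [msg for sender, msg in messages.items() if sender != a.name]
        actions[a.name] = a.action_module(observations[a.name], inbox)
    return actions


def describe(obs: str) -> str:
    # Toy message module: simply verbalize the local observation.
    return f"I see: {obs}"


def toy_policy(obs: str, inbox: List[str]) -> str:
    # Toy action module: fetch an onion if a teammate reports an empty pot.
    return "fetch_onion" if any("empty pot" in msg for msg in inbox) else "wait"


agents = [Agent("chef_a", describe, toy_policy), Agent("chef_b", describe, toy_policy)]
observations = {"chef_a": "empty pot", "chef_b": "onion pile"}
actions = communication_round(agents, observations)
print(actions)
```

In this toy setup, `chef_b` cannot observe the pot directly and acts only because `chef_a` reported it, which is the kind of information sharing under partial observability that the message module is meant to provide.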
