Using deep reinforcement learning to promote sustainable human behaviour on a common pool resource problem

Abstract

A canonical social dilemma arises when finite resources are allocated to a group of people, who can choose to either reciprocate with interest, or keep the proceeds for themselves. What resource allocation mechanisms will encourage levels of reciprocation that sustain the commons? Here, in an iterated multiplayer trust game, we use deep reinforcement learning (RL) to design an allocation mechanism that endogenously promotes sustainable contributions from human participants to a common pool resource. We first trained neural networks to behave like human players, creating a simulated economy that allowed us to study how different mechanisms influenced the dynamics of receipt and reciprocation. We then used RL to train a social planner to maximise aggregate return to players. The social planner discovered a redistributive policy that led to a large surplus and an inclusive economy, in which players made roughly equal gains. The RL agent increased human surplus over baseline mechanisms based on unrestricted welfare or conditional cooperation, by conditioning its generosity on available resources and temporarily sanctioning defectors by allocating fewer resources to them. Examining the AI policy allowed us to develop an explainable mechanism that performed similarly and was more popular among players. Deep reinforcement learning can be used to discover mechanisms that promote sustainable human behaviour.
