Evolutionary Reinforcement Learning via Cooperative Coevolution

Abstract

Recently, evolutionary reinforcement learning has attracted much attention in various domains. Maintaining a population of actors, evolutionary reinforcement learning utilises the collected experiences to improve the behaviour policy through efficient exploration. However, the poor scalability of genetic operators limits the efficiency of optimising high-dimensional neural networks. To address this issue, this paper proposes a novel cooperative coevolutionary reinforcement learning (CoERL) algorithm. Inspired by cooperative coevolution, CoERL periodically and adaptively decomposes the policy optimisation problem into multiple subproblems and evolves a population of neural networks for each of the subproblems. Instead of using genetic operators, CoERL directly searches for partial gradients to update the policy. Updating the policy with partial gradients maintains consistency between the behaviour spaces of parents and offspring across generations. The experiences collected by the population are then used to improve the entire policy, which enhances the sampling efficiency. Experiments on six benchmark locomotion tasks demonstrate that CoERL outperforms seven state-of-the-art algorithms and baselines. An ablation study verifies the unique contribution of CoERL's core ingredients.
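
The decompose-and-update loop described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation; it makes several assumptions that the abstract does not state. The policy is reduced to a flat parameter vector, the episode return is replaced by a hypothetical quadratic `toy_fitness`, the decomposition is a periodic random regrouping of parameter indices (the paper's adaptive scheme may differ), and the partial gradient of each subproblem is estimated with a standard evolution-strategies perturbation estimator. The reuse of collected experiences to refine the full policy is omitted. All function and parameter names are illustrative.

```python
import numpy as np

def toy_fitness(theta):
    # Hypothetical stand-in for an episode return: higher is better,
    # with a maximum of 0 at theta = 1.
    return -float(np.sum((theta - 1.0) ** 2))

def coevolve(dim=12, n_groups=3, pop_size=20, sigma=0.1, lr=0.05,
             generations=200, regroup_every=10, seed=0):
    rng = np.random.default_rng(seed)
    theta = np.zeros(dim)
    # Assumed decomposition: split a random permutation of indices into groups.
    groups = np.array_split(rng.permutation(dim), n_groups)
    for gen in range(generations):
        # Periodically re-decompose the problem into new subproblems.
        if gen > 0 and gen % regroup_every == 0:
            groups = np.array_split(rng.permutation(dim), n_groups)
        for idx in groups:
            # Population for this subproblem: perturb only the parameters in idx,
            # keeping the rest of the policy fixed as the shared context.
            noise = rng.standard_normal((pop_size, idx.size))
            scores = np.empty(pop_size)
            for k in range(pop_size):
                candidate = theta.copy()
                candidate[idx] += sigma * noise[k]
                scores[k] = toy_fitness(candidate)
            # ES-style estimate of the gradient restricted to idx,
            # using the population mean as a baseline.
            advantages = scores - scores.mean()
            partial_grad = advantages @ noise / (pop_size * sigma)
            # Gradient ascent on the subproblem's parameters only.
            theta[idx] += lr * partial_grad
    return theta

if __name__ == "__main__":
    theta = coevolve()
    print("final fitness:", toy_fitness(theta))
```

Because each update moves the parent along an estimated gradient direction rather than recombining or mutating whole networks, the offspring stays close to the parent in parameter space, which is one plausible reading of the consistency property the abstract mentions.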
