ReZero: Boosting MCTS-based Algorithms by Just-in-Time and Speedy Reanalyze

  • 2024-04-25 08:02:07
  • Chunyu Xuan, Yazhe Niu, Yuan Pu, Shuai Hu, Jing Yang

Abstract

MCTS-based algorithms, such as MuZero and its derivatives, have achieved widespread success in various decision-making domains. These algorithms employ the reanalyze process to enhance sample efficiency, albeit at the expense of significant wall-clock time consumption. To address this issue, we propose a general approach named ReZero to boost MCTS-based algorithms. Specifically, we propose a new scheme that simplifies data collecting and reanalyzing, which significantly reduces the search cost while also guaranteeing performance. Furthermore, to accelerate each search process, we conceive a method to reuse the subsequent information in the trajectory. The corresponding analysis conducted on the bandit model also provides auxiliary theoretical substantiation for our design. Experiments conducted on Atari environments and board games demonstrate that ReZero substantially improves training speed while maintaining high sample efficiency. The code is available as part of the LightZero benchmark at https://github.com/opendilab/LightZero.

 
