Overcoming Knowledge Barriers: Online Imitation Learning from Observation with Pretrained World Models

Abstract

Incorporating the successful paradigm of pretraining and finetuning from Computer Vision and Natural Language Processing into decision-making has become increasingly popular in recent years. In this paper, we study Imitation Learning from Observation with pretrained models and find that existing approaches such as BCO and AIME face knowledge barriers, specifically the Embodiment Knowledge Barrier (EKB) and the Demonstration Knowledge Barrier (DKB), which greatly limit their performance. The EKB arises when pretrained models lack knowledge about unseen observations, leading to errors in action inference. The DKB results from policies trained on limited demonstrations, hindering adaptability to diverse scenarios. We thoroughly analyse the underlying mechanism of these barriers and propose AIME-v2, built upon AIME, as a solution. AIME-v2 uses online interactions with a data-driven regulariser to alleviate the EKB and mitigates the DKB by introducing a surrogate reward function to enhance policy training. Experimental results on tasks from the DeepMind Control Suite and Meta-World benchmarks demonstrate the effectiveness of these modifications in improving both sample-efficiency and converged performance. The study contributes valuable insights into resolving knowledge barriers for enhanced decision-making in pretraining-based approaches. Code will be available at https://github.com/argmax-ai/aime-v2.
