On the Convergence of Continual Learning with Adaptive Methods

Abstract

One of the objectives of continual learning is to prevent catastrophic forgetting in learning multiple tasks sequentially, and the existing solutions have been driven by the conceptualization of the plasticity-stability dilemma. However, the convergence of continual learning for each sequential task has been less studied so far. In this paper, we provide a convergence analysis of memory-based continual learning with stochastic gradient descent and empirical evidence that training current tasks causes the cumulative degradation of previous tasks. We propose an adaptive method for nonconvex continual learning (NCCL), which adjusts step sizes of both previous and current tasks with the gradients. The proposed method can achieve the same convergence rate as the SGD method when the catastrophic forgetting term, which we define in the paper, is suppressed at each iteration. Further, we demonstrate that the proposed algorithm improves the performance of continual learning over existing methods for several image classification tasks.
