a6bf32a2db0fc3af.tex
1: \begin{abstract}
2:     One of the objectives of continual learning is to prevent catastrophic forgetting when learning multiple tasks sequentially,
3:     and existing solutions have been driven by the plasticity-stability dilemma.
4:     However, the convergence of continual learning on each sequential task has received less attention so far.
5:     In this paper, we provide a convergence analysis of memory-based continual learning with stochastic gradient descent
6:     and empirical evidence that training on current tasks causes cumulative degradation of performance on previous tasks.
7:     We propose an adaptive method for nonconvex continual learning (NCCL), which adjusts the step sizes for both previous and current tasks using the gradients.
8:     The proposed method can achieve the same convergence rate as the SGD method when the catastrophic forgetting term, which we define in the paper, is suppressed at each iteration.
9:     Further, we demonstrate that the proposed algorithm improves the performance of continual learning over existing methods on several image classification tasks.
10: \end{abstract}
11: