1: \begin{abstract}
2: One of the objectives of continual learning is to prevent catastrophic forgetting when learning multiple tasks sequentially,
3: and the existing solutions have been driven by the conceptualization of the plasticity-stability dilemma.
4: However, the convergence of continual learning on each sequential task has received little study so far.
5: In this paper, we provide a convergence analysis of memory-based continual learning with stochastic gradient descent (SGD)
6: and empirical evidence that training on current tasks causes cumulative degradation of performance on previous tasks.
7: We propose an adaptive method for nonconvex continual learning (NCCL), which adjusts the step sizes for both previous and current tasks based on their gradients.
8: The proposed method achieves the same convergence rate as the SGD method when the catastrophic forgetting term, which we define in the paper, is suppressed at each iteration.
9: Further, we demonstrate that the proposed algorithm outperforms existing continual learning methods on several image classification tasks.
10: \end{abstract}
11: