5a5364bc81f5317c.tex
1: \begin{abstract}
2: Online learning is a powerful tool for analyzing iterative algorithms. However, the classic adversarial setup sometimes fails to capture the regularity that online problems exhibit in practice.
3: %	
4: Motivated by this, we establish a new setup, called Continuous Online Learning (COL), where the gradient of the online loss function changes continuously across rounds with respect to the learner's decisions.
5: %
6: We show that COL covers and more appropriately describes many interesting applications, from general equilibrium problems (EPs) to optimization in episodic MDPs. 
7: %
8: %
9: Using this new setup, we revisit the difficulty of achieving sublinear dynamic regret. 
10: %
11: We prove that there is a fundamental equivalence between achieving sublinear dynamic regret in COL and solving certain EPs, and we present a reduction from dynamic regret to both static regret and the convergence rate of the associated EP.
12: % 
13: Finally, we specialize these new insights to online imitation learning and show an improved understanding of its learning stability.
14: %optimization in episodic MDP, \red{including fitter Q-iteration and online imitation learning.}
15: \end{abstract}