\begin{abstract}
Existing analyses of AdaGrad and other adaptive methods for smooth
convex optimization typically assume a bounded domain diameter.
For unconstrained problems, previous works guarantee only an asymptotic
convergence rate, without an explicit constant factor that holds
for the entire function class. Furthermore, in the stochastic setting,
only a modified version of AdaGrad has been analyzed, one that differs
from the version commonly used in practice in that the latest gradient
is not used to update the stepsize. Our paper aims to bridge these
gaps and to develop a deeper understanding of AdaGrad and its variants
in the standard setting of smooth convex functions as well as the
more general setting of quasar convex functions. First, we demonstrate
new techniques to explicitly bound the convergence rate of vanilla
AdaGrad for unconstrained problems in both the deterministic and stochastic
settings. Second, we propose a variant of AdaGrad for which we can
show convergence of the last iterate, instead of the average iterate.
Finally, we give new accelerated adaptive algorithms and their convergence
guarantees in the deterministic setting with explicit dependence on
the problem parameters, improving upon the asymptotic rates shown in
previous works.
\end{abstract}