abstract:ccf7b105b4791599.tex

1: \begin{abstract}

2: % We study non-convex and convex optimization problems with generalized smooth functions. In particular, we focus on the $(L_0, L_1)$-smooth class, which is wider than the standard smooth class and is attractive due to its machine learning applications. We improved the existing properties and presented new tools for the analysis of optimization methods for $(L_0, L_1)$-smooth functions. For non-convex problems, by minimizing an upper bound on the function value, we derived new stepsizes for the gradient method and obtained

3: % % a $\mathcal{O}(\frac{L_1 (f(x_0) - f^*)}{\epsilon} + \frac{ L_0 (f(x_0) - f^*)}{\epsilon^2})$

4: % a convergence rate matching existing results. For convex functions, we improved existing results and achieved the best-known

5: % % results and achieved

6: % % $\mathcal{O}(\frac{L_0 R_0^2 }{  \epsilon} + L_1^2 \|x_0 - x^*\|^2)$

7: % convergence rate for the gradient method with the proposed stepsizes. Motivated by the growing interest in adaptive stepsizes, we analyzed the gradient method with Polyak stepsizes and the normalized gradient method, and achieved the best-known

8: % % $\mathcal{O}(\frac{L_0 R_0^2 }{  \epsilon} + L_1^2 \|x_0 - x^*\|^2)$

9: % convergence rate without additional information on the parameters $L_0$ and $L_1$. Finally, we proposed using the accelerated method only after a fixed number of iterations of the gradient method to achieve a better rate.

10: % % to achieve $\mathcal{O}( \frac{\sqrt{ L_0} \|x_0 - x^*\|}{\sqrt{\epsilon}} + L_1^2 \|x_0 - x^*\|^2)$ total complexity.

11: % In our analysis, we do not assume $L$-Lipschitz continuity of the gradient. \an{Moreover, our convergence rates do not depend exponentially on $L_0$ or $L_1$, and do not depend

12: % %on $L$ Lipschitz constant or

13: % on the gradient norm at the initial iterate.}

14: % \end{abstract}

15: