1: \begin{abstract}
2: The Polyak-\L ojasiewicz (PL) condition \citep{POLYAK1963864} is weaker than strong convexity but suffices to ensure global convergence of the Gradient Descent algorithm. In this paper, we study lower bounds for algorithms that use first-order oracles to find an approximate optimal solution. We show that any first-order algorithm requires at least ${\Omega}\left(\frac{L}{\mu}\log\frac{1}{\eps}\right)$ gradient costs to find an $\eps$-approximate optimal solution for a general $L$-smooth function with PL constant $\mu$. This result demonstrates the \textit{optimality} of the Gradient Descent algorithm for minimizing smooth PL functions, in the sense that there exists a ``hard'' PL function on which no first-order algorithm can be faster than Gradient Descent, up to a numerical constant. In contrast, it is well known that the momentum technique, e.g., \citep[chap.~2]{nesterov2003introductory}, provably accelerates Gradient Descent to ${O}\left(\sqrt{\frac{L}{\hat{\mu}}}\log\frac{1}{\eps}\right)$ gradient costs for functions that are $L$-smooth and $\hat{\mu}$-strongly convex. Therefore, our result separates the hardness of minimizing a smooth PL function from that of minimizing a smooth strongly convex function: the complexity of the former cannot, in general, be improved by any polynomial order.
3:   
11: \end{abstract}
12: