\begin{abstract}
Recent studies have shown that many nonconvex machine learning problems satisfy a so-called generalized-smooth condition, which extends beyond the standard smoothness assumption of nonconvex optimization. However, existing algorithms designed for generalized-smooth nonconvex optimization have significant limitations in both their design and their convergence analysis.
% Large language models and complex training paradigms challenge the assumption of L-smoothness in optimization, revealing the need for generalized smoothness conditions.
In this work, we first study deterministic generalized-smooth nonconvex optimization and analyze the convergence of normalized gradient descent under the generalized Polyak--{\L}ojasiewicz condition. Our results clarify the interplay between gradient normalization and function geometry.
%introduce and analyze how $\beta$-normalized gradient descent hyper-parameter choices and learning objective geometry characterized by generalized smoothness and P{\L} condition impact convergence.
Then, for stochastic generalized-smooth nonconvex optimization, we propose an independently-normalized stochastic gradient descent algorithm, which leverages independent sampling, gradient normalization, and clipping to achieve an $\mathcal{O}(\epsilon^{-4})$ sample complexity under relaxed assumptions. Experiments demonstrate the fast convergence of our algorithm.
\end{abstract}