\begin{abstract}
Bilevel optimization has wide applications, including hyperparameter tuning, neural architecture search, and meta-learning.
Designing efficient algorithms for bilevel optimization is challenging because the lower-level problem defines a feasibility set implicitly via another optimization problem.
In this work, we consider the tractable case in which the lower-level problem is strongly convex.
Recent works show that, with a Hessian-vector product oracle, one can provably find an $\epsilon$-first-order stationary point within $\tilde{\mathcal{O}}(\epsilon^{-2})$ oracle calls.
However, Hessian-vector products may be inaccessible or expensive in practice.
Kwon et al. (ICML 2023) addressed this issue by proposing a first-order method that achieves the same goal at a slower rate of $\tilde{\mathcal{O}}(\epsilon^{-3})$.
In this work, we provide a tighter analysis demonstrating that this method converges at the same near-optimal $\tilde{\mathcal{O}}(\epsilon^{-2})$ rate as second-order methods.
Our analysis further leads to simple first-order algorithms that achieve similar convergence rates for finding second-order stationary points and for distributed bilevel problems.
\end{abstract}