abstract:d74c15e0c971de51.tex

1: \begin{abstract}

2: Bilevel optimization has wide applications, including hyperparameter tuning, neural architecture search, and meta-learning.

3: Designing efficient algorithms for bilevel optimization is challenging because the lower-level problem defines a feasibility set implicitly via another optimization problem.

4: In this work, we consider one

5: tractable case, in which the lower-level problem is strongly convex. Recent works show that, with a Hessian-vector product oracle, one can provably find

6: an $\epsilon$-first-order stationary point

7: within $\tilde{\mathcal{O}}(\epsilon^{-2})$ oracle calls. However, Hessian-vector products may be inaccessible or expensive in practice.

8: Kwon et al. (ICML 2023) addressed this issue by proposing a first-order method that achieves the same goal at a slower

9: rate of $\tilde{\mathcal{O}}(\epsilon^{-3})$.

10: In this work, we provide a tighter analysis demonstrating that this method converges at the same near-optimal $\tilde{\mathcal{O}}(\epsilon^{-2})$ rate as second-order methods.

11: Our analysis further leads to

12: simple first-order algorithms that achieve similar convergence rates for finding second-order stationary points and for distributed bilevel problems.

13: \end{abstract}

14: