
\begin{abstract}

Bilevel optimization lies at the core of several machine learning problems, such as hyperparameter tuning, data denoising, meta- and few-shot learning, and training-data poisoning.

Unlike simultaneous or multi-objective optimization, computing the steepest descent direction for the upper-level cost in a bilevel problem requires the inverse of the Hessian of the lower-level cost.

In this work, we propose a novel algorithm, based on the classical penalty function approach, for solving bilevel optimization problems.

Our method avoids computing the Hessian inverse and readily handles constrained bilevel problems.

We prove the convergence of the method under mild conditions and show that the exact hypergradient is obtained asymptotically.

The simplicity of the method and its low space and time complexity allow us to solve large-scale bilevel problems involving deep neural networks effectively.

We present results on data denoising, few-shot learning, and training-data poisoning problems in a large-scale setting.

The results show that our approach matches or outperforms previously proposed methods based on automatic differentiation and approximate inversion in accuracy, run time, and convergence speed.

\end{abstract}
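
% Illustrative sketch only: the notation below is assumed for exposition and is not taken from the abstract.
For orientation, the setting summarized in the abstract can be written out as follows. The notation is an assumption made here: $u$ and $v$ denote the upper- and lower-level variables, $f$ and $g$ the upper- and lower-level costs, and $\gamma > 0$ a penalty parameter. A generic unconstrained bilevel problem reads
\[
  \min_{u} \; f\bigl(u, v^{*}(u)\bigr)
  \quad \mathrm{s.t.} \quad
  v^{*}(u) = \arg\min_{v} \, g(u, v).
\]
If $g$ is twice differentiable and $\nabla^{2}_{vv} g$ is invertible at $v^{*}(u)$, the implicit function theorem applied to the stationarity condition $\nabla_{v} g(u, v^{*}(u)) = 0$ gives the hypergradient
\[
  \frac{\mathrm{d} f}{\mathrm{d} u}
  = \nabla_{u} f
  - \nabla^{2}_{uv} g \, \bigl(\nabla^{2}_{vv} g\bigr)^{-1} \nabla_{v} f ,
\]
with all derivatives evaluated at $(u, v^{*}(u))$. This is where the inverse Hessian of the lower-level cost enters. One common instance of the classical penalty approach, which may differ in detail from the formulation used in this work, replaces the lower-level problem by its stationarity condition and penalizes its violation:
\[
  \min_{u, v} \; f(u, v) + \frac{\gamma}{2} \, \bigl\| \nabla_{v} g(u, v) \bigr\|^{2} .
\]
The gradients of this penalized objective involve only Hessian-vector products such as $\nabla^{2}_{vv} g \, \nabla_{v} g$ and $\nabla^{2}_{uv} g \, \nabla_{v} g$, so no Hessian inverse is needed.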
