1ffc2c796dadaac0.tex
1: \begin{abstract}
2: Solving a bilevel optimization problem is at the core of several machine learning problems such as hyperparameter tuning, data denoising, meta- and few-shot learning, and training-data poisoning. 
3: Unlike in simultaneous or multi-objective optimization, computing the steepest descent direction for the upper-level cost in a bilevel problem requires the inverse of the Hessian of the lower-level cost. 
4: In this work, we propose a novel algorithm for solving bilevel optimization problems based on the classical penalty function approach. 
5: Our method avoids computing the Hessian inverse and readily handles constrained bilevel problems. 
6: We prove the convergence of the method under mild conditions and show that the exact hypergradient is obtained asymptotically. 
7: The simplicity of our method, together with its low space and time complexity, enables us to solve large-scale bilevel problems involving deep neural networks. 
8: We present results on data denoising, few-shot learning, and training-data poisoning problems in a large-scale setting. 
9: Our results show that our approach performs comparably to or better than previously proposed methods based on automatic differentiation and approximate inversion in terms of accuracy, run time, and convergence speed.
10: \end{abstract}
11: