\begin{abstract}
Bilevel optimization has recently attracted increasing interest in machine
learning due to its many applications, such as hyper-parameter optimization and meta-learning.
Although many bilevel methods have recently been proposed,
these methods do not use adaptive learning rates, even though adaptive learning rates are well known to accelerate
optimization algorithms.
To fill this gap, in this paper we propose a novel fast adaptive bilevel framework for solving stochastic bilevel optimization problems in which
the outer problem is possibly nonconvex and the inner problem is strongly convex.
Our framework uses unified adaptive matrices, which cover many types of adaptive learning rates, and
can flexibly incorporate momentum and variance-reduction techniques.
In particular, we provide a useful convergence analysis framework for bilevel optimization.
Specifically, we propose a fast single-loop adaptive bilevel optimization (BiAdam) algorithm,
which achieves a sample complexity of $\tilde{O}(\epsilon^{-4})$ for finding an $\epsilon$-stationary solution.
We also propose an accelerated version of the BiAdam algorithm (VR-BiAdam), which reaches
the best known sample complexity of $\tilde{O}(\epsilon^{-3})$.
To the best of our knowledge, we are the first to
study bilevel optimization methods with adaptive learning rates.
Experimental results on data hyper-cleaning and hyper-representation learning tasks demonstrate the efficiency of our algorithms.
\end{abstract}
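
\noindent For orientation, the problem class summarized in the abstract is commonly written as follows. This is only a sketch under assumed notation: the symbols $f$, $g$, $\xi$, $\zeta$, $x$, $y$, $d$, and $p$ do not appear in the abstract and may differ from the definitions used in the body of the paper.
% Assumed notation (not from the abstract): f, g, \xi, \zeta, x, y, d, p.
% \mathbb assumes the amssymb (or amsfonts) package is loaded.
\[
\min_{x \in \mathbb{R}^{d}} \; F(x) := \mathbb{E}_{\xi}\big[ f\big(x, y^{*}(x); \xi\big) \big]
\quad \mbox{s.t.} \quad
y^{*}(x) \in \arg\min_{y \in \mathbb{R}^{p}} \; \mathbb{E}_{\zeta}\big[ g(x, y; \zeta) \big].
\]
Here the outer objective $F$ is possibly nonconvex in $x$, while the inner objective is strongly convex in $y$, so that $y^{*}(x)$ is unique for each $x$. Under this reading, an $\epsilon$-stationary solution would typically be a point $x$ satisfying $\mathbb{E}\|\nabla F(x)\| \le \epsilon$; the precise convergence measure is an assumption here and should be checked against the analysis.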