\begin{abstract}
Bilevel optimization has recently attracted increased interest in machine
learning due to its many applications, such as hyper-parameter optimization and meta learning.
Although many bilevel methods have recently been proposed,
these methods do not consider adaptive learning rates, even though it is well known that adaptive learning rates can accelerate
optimization algorithms.
To fill this gap, in this paper we propose a novel fast adaptive bilevel framework for solving stochastic bilevel optimization problems in which
the outer problem is possibly nonconvex and the inner problem is strongly convex.
Our framework uses unified adaptive matrices, which cover many types of adaptive learning rates, and
can flexibly incorporate momentum and variance-reduction techniques.
In particular, we provide a useful convergence analysis framework for bilevel optimization.
Specifically, we propose a fast single-loop adaptive bilevel optimization (BiAdam) algorithm,
which achieves a sample complexity of $\tilde{O}(\epsilon^{-4})$ for finding an $\epsilon$-stationary solution.
Meanwhile, we propose an accelerated version of the BiAdam algorithm (VR-BiAdam), which reaches
the best known sample complexity of $\tilde{O}(\epsilon^{-3})$.
To the best of our knowledge, we are the first to
study bilevel optimization methods with adaptive learning rates.
Experimental results on data hyper-cleaning and hyper-representation learning tasks demonstrate the efficiency of our algorithms.
\end{abstract}