d774587b3606707a.tex
1: \begin{abstract}
2: Bilevel optimization is a popular hierarchical model in machine learning, and
3:  has been widely applied to many machine learning tasks such as meta learning,
4: hyperparameter learning and policy optimization. Although many bilevel
5: optimization algorithms have recently been developed, few adaptive algorithms focus on
6: bilevel optimization in the distributed setting. It is well known that adaptive gradient methods show superior performance on both distributed and non-distributed optimization.
7: In this paper, we therefore propose a novel adaptive federated bilevel optimization algorithm (i.e., AdaFBiO) to solve distributed bilevel optimization problems,
8: where the objective function of the Upper-Level (UL) problem is possibly nonconvex, and that of the Lower-Level (LL) problem is strongly convex.
9: Specifically, our AdaFBiO algorithm builds on the momentum-based variance-reduced technique and local-SGD
10: to obtain the best known sample and communication complexities simultaneously.
11: In particular, our AdaFBiO algorithm uses unified adaptive matrices to flexibly incorporate various adaptive learning rates when
12: updating the variables in both the UL and LL problems.
13: Moreover, we provide a convergence analysis framework for our AdaFBiO algorithm, and prove that it requires a sample complexity of $\tilde{O}(\epsilon^{-3})$
14: and a communication complexity of $\tilde{O}(\epsilon^{-2})$ to obtain an $\epsilon$-stationary point.
15: Experimental results on federated hyper-representation learning and federated data hyper-cleaning tasks verify the efficiency of our algorithm.
16: \end{abstract}
17: