\begin{abstract}
  We propose and study several server-extrapolation strategies for enhancing the theoretical and empirical convergence properties of the popular federated learning optimizer {\FEDPROX} \citep{li2020federated}.
  While it has long been known that some form of extrapolation can help in practical federated learning (FL), only a handful of works provide any theoretical guarantees.
  The phenomenon seems elusive, and our current theoretical understanding remains severely incomplete.
  We focus on smooth convex or strongly convex problems in the interpolation regime.
  In particular, we propose Extrapolated {\FEDPROX} ({\FEDEXPROX}) and study three extrapolation strategies: a constant strategy (depending on various smoothness parameters and the number of participating devices) and two smoothness-adaptive strategies, one based on the notion of gradient diversity ({\FEDEXPROXG}) and the other based on the stochastic Polyak stepsize ({\FEDEXPROXS}).
  Our theory is corroborated by carefully constructed numerical experiments.
\end{abstract}
