\begin{abstract}
  Federated Learning (FL) is a distributed learning paradigm that scales on-device learning collaboratively and privately.
  Standard FL algorithms such as \fedavg are primarily geared towards \emph{smooth unconstrained} settings.
  In this paper, we study the \emph{Federated Composite Optimization} (FCO) problem, in which the loss function contains a non-smooth regularizer.
  Such problems arise naturally in FL applications that involve sparsity, low-rank structure, monotonicity, or more general constraints.
  We first show that straightforward extensions of primal algorithms such as \fedavg are not well suited for FCO since they suffer from the ``curse of primal averaging,'' resulting in poor convergence.
  As a solution, we propose a new primal-dual algorithm, \emph{Federated Dual Averaging} (\feddualavg), which circumvents the curse of primal averaging by employing a novel server dual averaging procedure.
  Our theoretical analysis and empirical experiments demonstrate that \feddualavg outperforms the baselines.
\end{abstract}