50d05df5b451f54e.tex
\begin{abstract}
We consider a standard federated learning architecture where a group of clients periodically coordinate with a central server to train a statistical model. We tackle two major challenges in federated learning: (i) objective heterogeneity, which stems from differences in the clients' local loss functions, and (ii) systems heterogeneity, which leads to slow and straggling client devices. We show that, owing to such client heterogeneity, existing federated learning algorithms suffer from a fundamental speed-accuracy conflict: they guarantee either linear convergence to an incorrect point, or convergence to the global minimum at a sub-linear rate. In other words, fast convergence comes at the expense of accuracy.

To address this limitation, we propose \texttt{FedLin}, a simple new algorithm that exploits past gradients and employs client-specific learning rates. When the clients' local loss functions are smooth and strongly convex, we show that \texttt{FedLin} guarantees linear convergence to the global minimum. We then establish matching upper and lower bounds on the convergence rate of \texttt{FedLin} that highlight the trade-offs associated with infrequent, periodic communication. Notably, \texttt{FedLin} is the only approach that is able to match centralized convergence rates (up to constants) for smooth and strongly convex, convex, and non-convex loss functions despite arbitrary objective and systems heterogeneity. We further show that \texttt{FedLin} preserves linear convergence rates under aggressive gradient sparsification, and we quantify the effect of the compression level on the convergence rate.
\end{abstract}