\begin{abstract}
Federated learning aims to jointly learn statistical models over massively distributed remote devices.
In this work, we propose \feddane, an optimization method that we adapt from \dane~\cite{shamir2014communication,AIDE_reddi_16}, a method for classical distributed optimization, to handle the practical constraints of federated learning. We provide convergence guarantees for this method when learning over both convex and non-convex functions. Despite encouraging theoretical results, we find that the method has underwhelming performance empirically. In particular, through empirical simulations on both synthetic and real-world datasets, \feddane consistently underperforms baselines of \fedavg~\cite{mcmahan2016FedAvg} and \fedprox~\cite{tian2018federated} in realistic federated settings. We identify low device participation and statistical device heterogeneity as two underlying causes of this shortfall, and conclude by suggesting several directions for future work.
\end{abstract}
