\begin{abstract}
Federated learning (FL) is a fast-developing technique that allows multiple workers to train a global model based on a distributed dataset. %{\color{blue}Since the data is stored and collected at the edge worker directly and workers do not share their raw data, FL faces non-independent and identically distributed (non-i.i.d.) data and improves data privacy. }
Conventional FL employs the gradient descent algorithm, which may not be efficient enough. Nesterov Accelerated Gradient (NAG) is well known to be more advantageous in centralized training environments, but it has so far been unclear how to quantify the benefits of NAG in FL.
%and no prior work has rigorously analyzed convergence of nesterov accelerated gradient (NAG) in the FL environment. It is also not clear how to quantify the performance gap between SGD and NAG in the FL environment.
In this work, we focus on a version of FL based on NAG (FedNAG) and provide a detailed convergence analysis. We compare the result with that of conventional FL based on gradient descent (FedAvg). One interesting conclusion is that FedNAG outperforms FedAvg as long as the learning step size is sufficiently small. Extensive experiments on real-world datasets verify our conclusions and confirm the better convergence performance of FedNAG.
\end{abstract}