
\begin{abstract}

This paper investigates how to accelerate the convergence of distributed optimization algorithms on non-convex problems.

We propose a distributed primal-dual stochastic gradient descent~(SGD) algorithm equipped with the ``powerball'' method to accelerate convergence.

We show that the proposed algorithm achieves the linear speedup convergence rate $\mathcal{O}(1/\sqrt{nT})$ for general smooth (possibly non-convex) cost functions.

We demonstrate the efficiency of the algorithm through numerical experiments, training two-layer fully connected neural networks and convolutional neural networks on the MNIST dataset and comparing the results against state-of-the-art distributed SGD algorithms and centralized SGD algorithms.

\end{abstract}
