\begin{abstract} % Abstract of not more than 200 words.
This work studies the distributed empirical risk minimization (ERM) problem under a differential privacy (DP) constraint. Existing algorithms typically achieve DP by perturbing every local full gradient with noise, which significantly degrades utility compared to centralized algorithms.
To tackle this issue, we first introduce a node sampling strategy for distributed ERM, in which a small fraction of nodes is randomly selected at each iteration to perform optimization.
Then, we develop a class of distributed dual averaging (DDA) algorithms under this framework, in which only the stochastic subgradients computed on individual data samples at the activated nodes are perturbed, so as to amplify the DP guarantee.
We prove that the proposed algorithms incur a utility loss comparable to that of existing centralized private algorithms for both general convex and strongly convex problems. When the noise is removed, our algorithm attains the optimal $\mathcal{O}(1/t)$ convergence rate for non-smooth stochastic optimization.
% , for which DDA was known to converge at $\mathcal{O}(1/\sqrt{t})$ prior to our work.
Finally, experimental results on two benchmark datasets verify the effectiveness of the proposed algorithms.
\end{abstract}