abstract:ec125aaba4dd0742.tex

1: \begin{abstract}

2: Stochastic gradient MCMC methods, such as stochastic gradient Langevin dynamics (SGLD), enable large-scale posterior inference by leveraging noisy but cheap gradient estimates. However, when federated data are non-IID, the variance of distributed gradient estimates is amplified relative to the centralized setting, and delayed communication rounds cause chains to diverge from the target posterior.

3: %

4: In this work, we introduce the concept of \emph{conducive gradients}, zero-mean stochastic gradients that serve as a mechanism for sharing probabilistic information between data shards.

5: %

6: We propose a novel stochastic gradient estimator that incorporates conducive gradients, and we show that it improves convergence on federated data compared to distributed SGLD (DSGLD).

7: %

8: We evaluate \emph{conducive gradient DSGLD} (CG-DSGLD) on metric learning and deep MLP tasks.

9: Experiments show that CG-DSGLD outperforms standard DSGLD on non-IID federated data.

10: \end{abstract}

11: