\begin{abstract}
Stochastic gradient MCMC methods, such as stochastic gradient Langevin dynamics (SGLD), enable large-scale posterior inference by leveraging noisy but cheap gradient estimates. However, when federated data are non-IID, the variance of distributed gradient estimates is amplified relative to the centralized setting, and delayed communication rounds lead chains to diverge from the target posterior.
%
In this work, we introduce the concept of \emph{conducive gradients}, zero-mean stochastic gradients that serve as a mechanism for sharing probabilistic information between data shards.
%
We propose a novel stochastic gradient estimator that incorporates the conducive gradients, and we show that it improves convergence on federated data compared to distributed SGLD (DSGLD).
%
We evaluate \emph{conducive gradient DSGLD} (CG-DSGLD) on metric learning and deep MLP tasks.
Experiments show that it outperforms standard DSGLD on non-IID federated data.
\end{abstract}
