\begin{abstract}
This paper studies the role of data homogeneity in multi-agent optimization. Concentrating on the decentralized stochastic gradient ({\sf DSGD}) algorithm, we characterize the transient time, defined as the minimum number of iterations required for {\sf DSGD} to achieve performance comparable to that of its centralized counterpart. When the Hessians of the objective functions are identical across agents, we show that the transient time of {\sf DSGD} is ${\cal O}( n^{4/3} / \rho^{8/3} )$ for smooth (possibly non-convex) objective functions, where $n$ is the number of agents and $\rho$ is the spectral gap of the connectivity graph. This improves over the bound of ${\cal O}( n^2 / \rho^4 )$ that holds without the Hessian homogeneity assumption. Our analysis leverages the property that the objective functions are twice continuously differentiable. Numerical experiments are presented to illustrate the importance of data homogeneity for the fast convergence of {\sf DSGD}.
\end{abstract}