
\begin{abstract}

Decentralized learning has recently received increasing attention in machine learning due to its advantages in implementation simplicity, system robustness, and data privacy.
Meanwhile, adaptive gradient methods show superior performance in many machine learning tasks, such as training neural networks.
Although some works have studied decentralized optimization algorithms with adaptive learning rates, these adaptive decentralized algorithms still suffer from high sample complexity.

To fill these gaps, we propose a class of faster adaptive decentralized algorithms (i.e., AdaMDOS and AdaMDOF) for distributed nonconvex stochastic and finite-sum optimization, respectively.
Moreover, we provide a solid convergence analysis framework for our methods.

In particular, we prove that AdaMDOS obtains a near-optimal sample complexity of $\tilde{O}(\epsilon^{-3})$ for finding an $\epsilon$-stationary solution of nonconvex stochastic optimization.
Meanwhile, AdaMDOF obtains a near-optimal sample complexity of $O(\sqrt{n}\epsilon^{-2})$ for finding an $\epsilon$-stationary solution of nonconvex finite-sum optimization, where $n$ denotes the sample size.
To the best of our knowledge, AdaMDOF is the first adaptive decentralized algorithm for nonconvex finite-sum optimization.

Experimental results demonstrate the efficiency of our algorithms.

\end{abstract}
