\begin{abstract}
Understanding the convergence performance of the asynchronous stochastic gradient descent (Async-SGD) method has received increasing attention in recent years due to its foundational role in machine learning.
To date, however, most existing works are restricted to either bounded gradient delays or convex settings.
%In this paper, we focus on Async-SGD for non-convex optimization problems with unbounded gradient delays.
In this paper, we focus on Async-SGD and its variant Async-SGDI (which uses an increasing batch size) for non-convex optimization problems with unbounded gradient delays.
%We analyze the convergence performance of standard Async-SGD and an Async-SGD variant with increasing batch size (Async-SGDI) aiming for variance reduction.
%We prove asymptotic $o(1/\sqrt{k})$ convergence rate for the standard Async-SGD algorithm and $o(1/k)$ for Async-SGDI.
We prove an $o(1/\sqrt{k})$ convergence rate for Async-SGD and an $o(1/k)$ convergence rate for Async-SGDI.
%Also, we develop a unifying sufficient condition for Async-SGD's convergence that includes two major gradient update delay models in the literature as special cases.
We also establish a unifying sufficient condition for the convergence of Async-SGD, which includes two major gradient delay models in the literature as special cases and yields a new delay model not considered thus far.
\end{abstract}