\begin{abstract}
The optimistic gradient method is useful in addressing minimax optimization problems.
Motivated by the observation that the conventional stochastic version requires a large batch size on the order of $\mathcal{O}(\varepsilon^{-2})$ to achieve an $\varepsilon$-stationary solution, we introduce and analyze a new formulation termed Diffusion Stochastic Same-Sample Optimistic Gradient (DSS-OG). We prove its convergence and resolve the large-batch issue by establishing a tighter upper bound under the more general setting of nonconvex Polyak-\L{}ojasiewicz (PL) risk functions.
We also extend the applicability of the proposed method to the distributed scenario, where agents communicate with their neighbors via a left-stochastic protocol.
To implement DSS-OG, we can query the stochastic gradient oracles in parallel with some extra memory overhead, resulting in a complexity comparable to that of its conventional counterpart. To demonstrate the efficacy of the proposed algorithm, we conduct tests by training generative adversarial networks.
\end{abstract}