abstract:f82d26b15b55cee7.tex

1: \begin{abstract}

2: We combine two advanced ideas widely used in optimization for machine learning, the \textit{shuffling} strategy and the \textit{momentum} technique, to develop a novel shuffling gradient-based method with momentum, coined \textbf{S}huffling \textbf{M}omentum \textbf{G}radient (SMG), for non-convex finite-sum optimization problems.

3: While our method is inspired by momentum techniques, its update is fundamentally different from that of existing momentum-based methods.

4: We establish state-of-the-art convergence rates of SMG for any shuffling strategy, using either a constant or a diminishing learning rate, under standard assumptions (i.e., \textit{$L$-smoothness} and \textit{bounded variance}).

5: When the shuffling strategy is fixed, we develop another new algorithm that is similar to existing momentum methods,

6: and prove the same convergence rates for this algorithm under the $L$-smoothness and bounded gradient assumptions.

7: We demonstrate our algorithms via numerical simulations on standard datasets and compare them with existing shuffling methods.

8: Our tests show encouraging performance of the new algorithms.

9: \end{abstract}

10: