\begin{abstract}
We combine two advanced ideas widely used in optimization for machine learning, the \textit{shuffling} strategy and the \textit{momentum} technique, to develop a novel shuffling gradient-based method with momentum, coined \textbf{S}huffling \textbf{M}omentum \textbf{G}radient (SMG), for non-convex finite-sum optimization problems.
While our method is inspired by momentum techniques, its update is fundamentally different from that of existing momentum-based methods.
We establish state-of-the-art convergence rates of SMG for any shuffling strategy, using either a constant or a diminishing learning rate, under standard assumptions (i.e., \textit{$L$-smoothness} and \textit{bounded variance}).
When the shuffling strategy is fixed, we develop another new algorithm that is similar to existing momentum methods,
and prove the same convergence rates for this algorithm under the $L$-smoothness and bounded gradient assumptions.
We demonstrate our algorithms via numerical simulations on standard datasets and compare them with existing shuffling methods.
Our tests show encouraging performance of the new algorithms.
\end{abstract}