\begin{abstract}% <- trailing '%' for backward compatibility of .sty file
Ill-conditioned problems are ubiquitous in large-scale machine learning:
as a dataset grows to include more and more features correlated with the labels,
the condition number increases.
Yet traditional stochastic gradient methods
converge slowly on these ill-conditioned problems,
even with careful hyperparameter tuning.
This paper introduces PROMISE (\textbf{Pr}econditioned Stochastic \textbf{O}ptimization \textbf{M}ethods by \textbf{I}ncorporating \textbf{S}calable Curvature \textbf{E}stimates), a suite of sketching-based preconditioned stochastic gradient algorithms
that deliver fast convergence on ill-conditioned large-scale convex optimization problems arising in machine learning.
PROMISE includes preconditioned versions of SVRG, SAGA, and Katyusha;
each algorithm comes with a strong theoretical analysis and
effective default hyperparameter values.
% In contrast, traditional stochastic gradient methods
% require careful hyperparameter tuning to succeed,
% and degrade in the presence of ill-conditioning,
% a ubiquitous phenomenon in machine learning.
Empirically, we verify the superiority of the proposed algorithms
by showing that, using default hyperparameter values,
they outperform or match popular \emph{tuned} stochastic gradient optimizers
on a test bed of $51$ ridge and logistic regression problems
assembled from benchmark machine learning repositories.
On the theoretical side, this paper introduces the notion of \emph{quadratic regularity}
in order to establish linear convergence of all proposed methods
even when the preconditioner is updated infrequently.
The speed of linear convergence is determined by the \emph{quadratic regularity ratio},
which often provides a tighter bound on the convergence rate than the condition number does,
both in theory and in practice,
and explains the fast global linear convergence of the proposed methods.
\end{abstract}