abstract:5d876dfd379b67bd.tex

1: \begin{abstract}

2:   Convex objectives coming from linear supervised learning problems, such as

3:   penalized generalized linear models, can be formulated as finite sums of convex functions.

4:   For such problems, a large set of stochastic first-order solvers based on the idea of variance

5:   reduction are available and combine both computational efficiency and sound theoretical

6:   guarantees (linear convergence rates) \cite{johnson2013accelerating},

7:   \cite{schmidt2013minimizing}, \cite{shalev2013stochastic}, \cite{defazio2014saga}.

8:   Such rates are obtained under both gradient-Lipschitz and strong convexity

9:   assumptions.

10:   Motivated by learning problems that do not meet the gradient-Lipschitz assumption, such as

11:   linear Poisson regression, we work under another smoothness assumption, and

12:   obtain a linear convergence rate for a shifted version of Stochastic Dual Coordinate Ascent

13:   (SDCA) \cite{shalev2013stochastic} that improves on the current state of the art.

14:   Our motivation for considering a solver working on the Fenchel-dual problem comes from the

15:   fact that such objectives include many linear constraints, which are easier to handle in the

16:   dual.

17:   Our approach and theoretical findings are validated on several datasets, for Poisson regression

18:   and another objective coming from the negative log-likelihood of the Hawkes process, a

19:   family of models that proves extremely useful for modeling information

20:   propagation in social networks and causality

21:   inference \cite{de2016learning}, \cite{farajtabar2015coevolve}.

22: \end{abstract}

23: