\begin{abstract}
   Majorization-minimization algorithms consist of successively minimizing a
   sequence of upper bounds of the objective function. These upper bounds are
   tight at the current estimate, and each iteration monotonically drives the
   objective function downhill. Such a simple principle is widely applicable and
   has been very popular in various scientific fields, especially in signal
   processing and statistics. We propose an incremental majorization-minimization
   scheme for minimizing a large sum of continuous functions, a problem of utmost
   importance in machine learning. We present convergence guarantees for
   non-convex and convex optimization when the upper bounds approximate the
   objective up to a smooth error; we call such upper bounds ``first-order
   surrogate functions''. More precisely, we study asymptotic stationary point
   guarantees for non-convex problems, and for convex ones, we provide convergence
   rates for the expected objective function value. We apply our scheme to
   composite optimization and obtain a new incremental proximal gradient algorithm
   with linear convergence rate for strongly convex functions. Our experiments
   show that our method is competitive with the state of the art for solving
   machine learning problems such as logistic regression when the number of
   training samples is large enough, and we demonstrate its usefulness for sparse
   estimation with non-convex penalties.
\end{abstract}
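
As a brief illustration of the majorization-minimization principle described in the abstract, the following sketch spells out why an iteration cannot increase the objective. The symbols $f$ (the objective), $g_n$ (the upper bound built at iteration $n$), and $\theta_n$ (the iterate) are notation assumed here for illustration only; they do not appear in the abstract. Suppose that
\[
   g_n(\theta) \geq f(\theta) \quad \forall \theta, \qquad
   g_n(\theta_{n-1}) = f(\theta_{n-1}), \qquad
   \theta_n \in \arg\min_{\theta} g_n(\theta).
\]
Then
\[
   f(\theta_n) \leq g_n(\theta_n) \leq g_n(\theta_{n-1}) = f(\theta_{n-1}).
\]
The first inequality uses the upper-bound property, the second uses the fact that $\theta_n$ minimizes $g_n$, and the final equality is the tightness of the bound at the current estimate.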