\begin{abstract}
Majorization-minimization algorithms consist of successively minimizing a
sequence of upper bounds of the objective function. These upper bounds are
tight at the current estimate, and each iteration monotonically drives the
objective function downhill. Such a simple principle is widely applicable and
has been very popular in various scientific fields, especially in signal
processing and statistics. We propose an incremental majorization-minimization
scheme for minimizing a large sum of continuous functions, a problem of utmost
importance in machine learning. We present convergence guarantees for
non-convex and convex optimization when the upper bounds approximate the
objective up to a smooth error; we call such upper bounds ``first-order
surrogate functions''. More precisely, we study asymptotic stationary point
guarantees for non-convex problems, and for convex ones, we provide convergence
rates for the expected objective function value. We apply our scheme to
composite optimization and obtain a new incremental proximal gradient algorithm
with linear convergence rate for strongly convex functions. Our experiments
show that our method is competitive with the state of the art for solving
machine learning problems such as logistic regression when the number of
training samples is large enough, and we demonstrate its usefulness for sparse
estimation with non-convex penalties.
\end{abstract}