
\begin{abstract}
  Recently, a majorization method for optimizing partition functions of
  log-linear models was proposed alongside a novel quadratic
  variational upper bound. In the batch setting, it outperformed
  state-of-the-art first- and second-order optimization methods on
  various learning tasks. We propose a stochastic version of this
  bound majorization method as well as a low-rank modification for
  high-dimensional datasets. The resulting stochastic second-order
  method outperforms stochastic gradient descent (across variations
  and various tunings) both in the number of iterations and in
  computation time until convergence, while finding a higher-quality
  parameter setting. The proposed method bridges first- and
  second-order stochastic optimization methods by maintaining a
  computational complexity that is linear in the data dimension
  while exploiting second-order information about the pseudo-global
  curvature of the objective function (as opposed to the local
  curvature in the Hessian).
\end{abstract}
