
\begin{abstract}

We apply stochastic average gradient (SAG) algorithms to the training of conditional random fields (CRFs).

%SAG algorithms are the first general stochastic gradient algorithm to have linear convergence rates.

We describe a practical implementation that uses structure in the CRF gradient to reduce the memory requirement of this linearly convergent stochastic gradient method, propose a non-uniform sampling scheme that substantially improves practical performance, and analyze the rate of convergence of the SAGA variant under non-uniform sampling. Our experimental results show that our method often significantly outperforms existing methods in terms of the training objective, and performs as well as or better than optimally tuned stochastic gradient methods in terms of test error.

\end{abstract}