\begin{abstract}
This paper introduces the Metric-Free Natural Gradient (MFNG) algorithm for
training Boltzmann Machines. Similar in spirit to the Hessian-Free
method of \citet{martens2010hessian}, our algorithm belongs to the family
of truncated Newton methods and exploits an efficient matrix-vector product
to avoid explicitly storing the natural gradient metric $L$. This metric
is shown to be the expected second derivative of the log-partition function
(under the model distribution), or equivalently, the covariance of the vector of
partial derivatives of the energy function. We evaluate our method on the
joint training of a 3-layer Deep Boltzmann Machine and show that
MFNG converges faster per epoch than Stochastic
Maximum Likelihood with centering, though its wall-clock performance is
not currently competitive.
\end{abstract}
