\begin{abstract}
This paper introduces the Metric-Free Natural Gradient (MFNG) algorithm for
training Boltzmann Machines. Similar in spirit to the Hessian-Free
method of \citet{martens2010hessian}, our algorithm belongs to the family
of truncated Newton methods and exploits an efficient matrix-vector product
to avoid explicitly storing the natural gradient metric $L$. This metric
is shown to be the expected second derivative of the log-partition function
(under the model distribution), or equivalently, the covariance of the vector of
partial derivatives of the energy function. We evaluate our method on the
task of jointly training a 3-layer Deep Boltzmann Machine and show that
MFNG converges faster per epoch than Stochastic
Maximum Likelihood with centering, though its wall-clock performance is
currently not competitive.
\end{abstract}
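
As a brief illustration of the two characterizations of $L$ stated in the
abstract, the following sketch works under assumptions that are ours and not
fixed by the abstract itself: the model is written as
$p_\theta(x) = \exp(-E_\theta(x)) / Z(\theta)$ with $x$ collecting all units,
$\theta$ the parameter vector, and the natural gradient metric is taken to be
the Fisher information of $p_\theta$. The symbols $\theta$, $x$, $E_\theta$,
$Z$, $u$, $g$ and $N$ are notation introduced here for illustration.
Since $\nabla_\theta \log Z(\theta) = -\mathbb{E}_{p_\theta}[\nabla_\theta E_\theta(x)]$,
the score is a centered energy gradient,
\[
  \nabla_\theta \log p_\theta(x)
  = -\bigl(\nabla_\theta E_\theta(x)
           - \mathbb{E}_{p_\theta}[\nabla_\theta E_\theta(x)]\bigr),
\]
so that
\[
  L = \mathbb{E}_{p_\theta}\!\bigl[\nabla_\theta \log p_\theta(x)\,
                                   \nabla_\theta \log p_\theta(x)^{\top}\bigr]
    = \mathrm{Cov}_{p_\theta}\!\bigl[\nabla_\theta E_\theta(x)\bigr].
\]
Equivalently, $L = -\mathbb{E}_{p_\theta}[\nabla^2_\theta \log p_\theta(x)]
= \mathbb{E}_{p_\theta}[\nabla^2_\theta E_\theta(x)] + \nabla^2_\theta \log Z(\theta)$,
and when the energy is linear in $\theta$, as in a standard Boltzmann machine,
the first term vanishes and $L = \nabla^2_\theta \log Z(\theta)$.
The covariance form suggests how a product $Lu$ with an arbitrary vector $u$
might be estimated without forming $L$. Writing
$g(x) = \nabla_\theta E_\theta(x)$,
\[
  L u = \mathbb{E}_{p_\theta}\!\bigl[g(x)\,\bigl(g(x)^{\top} u\bigr)\bigr]
      - \mathbb{E}_{p_\theta}[g(x)]\,
        \bigl(\mathbb{E}_{p_\theta}[g(x)]^{\top} u\bigr),
\]
which, with both expectations replaced by averages over $N$ model samples,
costs $O(N \dim\theta)$ operations rather than the $O((\dim\theta)^2)$ storage
required by an explicit $L$. This is one plausible Monte Carlo estimator, given
only as a sketch; the estimator used by MFNG may differ in its details.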