\begin{abstract}
    \Ac{VI} seeks to approximate a target distribution $\pi$ by an element of a tractable family of distributions.
    Of key interest in statistics and machine learning is Gaussian VI,
    which approximates $\pi$ by minimizing
    the \ac{KL} divergence to\ $\pi$ over the space of Gaussians.
    In this work, we develop %
    the (Stochastic) Forward-Backward Gaussian Variational Inference (FB--GVI) algorithm to solve Gaussian VI.
    Our approach exploits the composite structure of the \ac{KL} divergence, which can be written as the sum of a smooth term (the potential) and a non-smooth term (the entropy) over the \ac{BW} space of Gaussians endowed with the Wasserstein distance.
    For our proposed algorithm,
    we obtain state-of-the-art convergence guarantees when $\pi$ is log-smooth and log-concave,
    as well as the first convergence guarantees to first-order stationary solutions when $\pi$ is only log-smooth.
\end{abstract}