\begin{abstract}
  We provide the first convergence guarantee for black-box variational inference (BBVI) with the reparameterization gradient.
  While preliminary investigations analyzed simplified versions of BBVI (\textit{e.g.}, bounded domain, bounded support, or optimizing only the scale), our setup does not need any such algorithmic modifications.
  Our results hold for log-smooth posterior densities, with and without strong log-concavity, and for the location-scale variational family.
  Notably, our analysis reveals that certain algorithm design choices commonly employed in practice, such as nonlinear parameterizations of the scale matrix, can result in suboptimal convergence rates.
  Fortunately, running BBVI with proximal stochastic gradient descent (SGD) fixes these limitations and thus achieves the strongest known convergence guarantees.
  We evaluate this theoretical insight by comparing proximal SGD against other standard implementations of BBVI on large-scale Bayesian inference problems.
\end{abstract}