
\begin{abstract}
  We provide the first convergence guarantee for black-box variational inference (BBVI) with the reparameterization gradient.
  While preliminary investigations analyzed simplified versions of BBVI (\textit{e.g.}, bounded domain, bounded support, or optimizing only the scale), our setup requires no such algorithmic modifications.
  Our results hold for log-smooth posterior densities, with and without strong log-concavity, and for the location-scale variational family.
  Notably, our analysis reveals that certain algorithm design choices commonly employed in practice, such as nonlinear parameterizations of the scale matrix, can result in suboptimal convergence rates.
  Fortunately, running BBVI with proximal stochastic gradient descent (SGD) fixes these limitations and thus achieves the strongest known convergence guarantees.
  We evaluate this theoretical insight by comparing proximal SGD against other standard implementations of BBVI on large-scale Bayesian inference problems.
\end{abstract}
