
\begin{abstract}

Let $V_* : \mathbb{R}^d \to \mathbb{R}$ be some (possibly non-convex) potential function, and consider the probability measure $\pi \propto e^{-V_*}$.

When $\pi$ exhibits multiple modes, it is known that sampling techniques based on Wasserstein gradient flows of the Kullback-Leibler (KL) divergence (e.g. Langevin Monte Carlo) converge slowly, because the dynamics are unable to easily traverse between modes.

In stark contrast, the work of \cite{lu2019accelerating,lu2022birth} has shown that the gradient flow of the KL divergence with respect to the Fisher-Rao (FR) geometry converges to $\pi$ at a rate that is \textit{independent} of the potential function.

In this short note, we complement these existing results in the literature by providing an explicit expansion of $\text{KL}(\rho_t^{\text{FR}}\|{\pi})$ in terms of $e^{-t}$, where $(\rho_t^{\text{FR}})_{t\geq 0}$ is the FR gradient flow of the KL divergence.

In turn, we obtain a clean asymptotic convergence rate with a burn-in time that is guaranteed to be finite.

Our proof is based on observing a similarity between FR gradient flows and simulated annealing with linear scaling, and on facts about cumulant generating functions.

We conclude with simple synthetic experiments that demonstrate that our theoretical findings are indeed tight.

Based on our numerics, we conjecture that, in some cases, the asymptotic rates of convergence of Wasserstein-Fisher-Rao gradient flows are related to this expansion.

\end{abstract}
