\begin{abstract}
%Among dissimilarities between probability distributions, the Kernel Stein Discrepancy (KSD) has shown promising results for goodness-of-fit tests.
Among dissimilarities between probability distributions, the Kernel Stein Discrepancy (KSD) has received much interest recently.
We investigate the properties of its Wasserstein gradient flow to approximate a target probability distribution $\pi$ on $\mathbb{R}^d$, known up to a normalization constant.
This leads to a straightforwardly implementable, deterministic score-based method to sample from $\pi$, named KSD Descent, which uses a set of particles to approximate $\pi$.
Remarkably, owing to a tractable loss function, KSD Descent can leverage robust parameter-free optimization schemes such as L-BFGS; this contrasts with other popular particle-based schemes such as the Stein Variational Gradient Descent algorithm.
We study the convergence properties of KSD Descent and demonstrate its practical relevance.
However, we also highlight failure cases by showing that the algorithm can get stuck in spurious local minima.%\ak{I would stop the sentence here}, even when $\pi$ is a simple mixture of Gaussians with isolated components.
\end{abstract}
