abstract:ce7d4a16a678e250.tex

1: \begin{abstract}%

2:   Langevin algorithms are gradient descent methods with additive noise. They have been used for decades in Markov chain Monte Carlo (MCMC) sampling, optimization, and learning.

3: %

4: Their convergence properties for unconstrained non-convex optimization and learning problems have been widely studied in recent years.

5: %

6: Other work has examined projected Langevin algorithms for sampling from log-concave distributions restricted to convex compact sets.

7: For learning and optimization, log-concave distributions correspond to convex losses.

8: %

9: In this paper, we analyze the case of non-convex losses with compact convex constraint sets and independent and identically distributed (IID) external data variables. We term the resulting method the projected stochastic gradient Langevin algorithm (PSGLA).

10: %

11: %

12: We show that the algorithm achieves a deviation of $O(T^{-1/4}(\log T)^{1/2})$ from its target distribution in the $1$-Wasserstein distance.

13: %

14: For optimization and learning, we show that the algorithm achieves $\epsilon$-suboptimal solutions, on average, provided that it is run for a time that is polynomial in $\epsilon^{-1}$ and slightly super-exponential in the problem dimension.

15: %

16:

17: \end{abstract}

18: