\begin{abstract}%
  Langevin algorithms are gradient descent methods with additive noise. They have been used for decades in Markov chain Monte Carlo (MCMC) sampling, optimization, and learning.
%
Their convergence properties for unconstrained non-convex optimization and learning problems have been studied widely in the last few years.
%
Other work has examined projected Langevin algorithms for sampling from log-concave distributions restricted to convex compact sets.
For learning and optimization, log-concave distributions correspond to convex losses.
%
In this paper, we analyze the case of non-convex losses with compact convex constraint sets and IID external data variables. We term the resulting method the projected stochastic gradient Langevin algorithm (PSGLA).
%
%
We show that the algorithm achieves a deviation of $O(T^{-1/4}(\log T)^{1/2})$ from its target distribution in $1$-Wasserstein distance.
%
For optimization and learning, we show that the algorithm achieves $\epsilon$-suboptimal solutions, on average, provided that it is run for a time that is polynomial in $\epsilon^{-1}$ and slightly super-exponential in the problem dimension.
%

\end{abstract}