\begin{abstract}
Markov chain Monte Carlo (MCMC) methods such as Gibbs sampling are finding widespread use in applied statistics and machine learning.
These methods often require significant computational power, and are increasingly being deployed on parallel and distributed systems such as compute clusters.
Recent work has proposed running iterative algorithms such as gradient descent and MCMC in parallel \emph{asynchronously} for increased performance, with good empirical results on certain problems.
Unfortunately, for MCMC this parallelization technique requires new convergence theory, as it has been explicitly demonstrated to lead to divergence on some examples.
Recent theory on \emph{Asynchronous Gibbs sampling} describes why these algorithms can fail, and provides a way to alter them to make them converge.
In this article, we describe how to apply this theory in a generic setting, to understand the asynchronous behavior of any MCMC algorithm, including those implemented using parameter servers, and those not based on Gibbs sampling.
\\\strut\\
\textbf{Keywords:} Bayesian statistics, big data, Gibbs sampling, iterative algorithm, Metropolis--Hastings algorithm, parallel and distributed systems, parameter server.
\end{abstract}