\begin{abstract}
In~\citep{Yangnips13}, the author presented distributed stochastic dual coordinate ascent (DisDCA) algorithms for solving large-scale regularized loss minimization problems. The well-motivated updates, referred to as the practical updates, were observed to perform markedly better than the naive updates. However, no rigorous analysis has been provided to explain these updates or their convergence rates. In this paper, we bridge the gap by providing a theoretical analysis of the convergence rates of the practical DisDCA algorithm. Our analysis, supported by empirical studies, shows that increasing the number of dual updates at each iteration can yield an exponential speed-up in convergence. This result justifies the superior performance of the practical DisDCA compared to the naive variant. As a byproduct, our analysis also reveals the convergence behavior of the one-communication DisDCA.%In particular, we show that when data on different machines are orthogonal to each other, the convergence rate of the practical DisDCA is better by a factor of $mK$ than SDCA,  where $m$ is the number of examples updated at each iteration and $K$ is the number of machines. For general cases, our analysis aided by empirical studies demonstrates that the convergence rate gains as close as $m$-times speed-up as expected. %Moreover, our analysis inspire us to combine DisDCA with the dual random projection technique for solving large-scale high dimensional optimization problems. 
\end{abstract}