abstract:81683a337a64090e.tex

1: \begin{abstract}

2: Because of the size of modern data sets and the limited storage space of a single local computer, data are often stored in a distributed manner. Large-scale machine learning on such data must therefore be performed over communication networks. Unlike the internal communication links of a single computer, such as the path between memory and a Central Processing Unit (CPU) or Graphics Processing Unit (GPU), the network links connecting distributed computers and servers introduce communication delays that can be a significant burden for large-scale machine learning on distributed data. Moreover, communication networks can be organized in various topologies, such as a star, a tree, or a ring. In this paper, we therefore study how network communication constraints affect the convergence speed of distributed machine learning optimization algorithms. First, we analyze the convergence rate of distributed dual coordinate ascent in a general tree-structured network, since every connected communication network has a spanning tree and a tree network can be understood as a generalization of a star network. Then, taking network communication delays into account, we optimize the network-constrained dual coordinate ascent to maximize its convergence speed in terms of operation time. Through numerical experiments, we demonstrate that, under different network communication delays, the delay-dependent numbers of local and global iterations in distributed dual coordinate ascent can play a significant role in achieving the maximum convergence speed.

3: \end{abstract}

4: