\begin{abstract}
We consider federated learning in tiered communication networks.
Our network model consists of a set of silos, each holding a vertical partition of the data. Each silo contains a hub and a set of clients, with the silo's vertical data shard partitioned horizontally across its clients.
We propose Tiered Decentralized Coordinate Descent (TDCD), a communication-efficient decentralized training algorithm for such two-tiered networks.
To reduce communication overhead, the clients in each silo perform multiple local gradient steps before sharing updates with their hub.
Each hub then adjusts its coordinates by averaging its clients' updates, after which the hubs exchange intermediate updates with one another.
We present a theoretical analysis of our algorithm and show how the convergence rate depends on the number of vertical partitions and the number of local updates.
We further validate our approach empirically via simulation-based experiments using a variety of datasets and objectives.
\end{abstract}