abstract:1983cd190b8bc1a6.tex

1: \begin{abstract}

2:

3: Limited computation and communication capabilities of clients pose significant challenges in federated learning (FL) over resource-limited edge nodes.

4: A potential solution to this problem is to deploy off-the-shelf sparse learning algorithms that train a binary sparse mask on each client with the expectation of training a consistent sparse server mask yielding sparse weight tensors.

5: However, as we investigate in this paper, such naive deployments result in a significant drop in accuracy compared to FL with dense models, especially for clients with limited resource budgets.

6: In particular, our investigations reveal a serious lack of consensus among the trained sparsity masks on clients, which prevents convergence for the server mask and potentially leads to a substantial drop in model performance.

7: Based on such key observations, we propose \textit{federated lottery aware sparsity hunting} (FLASH), a unified sparse learning framework to make the server win a lottery in terms of yielding a sparse sub-model, able to maintain classification performance under highly resource-limited client settings.

8: Moreover, to support FL on different devices requiring different parameter density, we leverage our findings to present \textit{hetero-FLASH}, where clients can have different target sparsity budgets based on their device resource limits.

9: Experimental evaluations with multiple models on various datasets (both IID and non-IID) show superiority of our models in closing the gap with unpruned baseline while yielding up to $\mathord{\sim}10.1\%$ improved accuracy with $\mathord{\sim}10.26\times$  fewer communication costs, compared to existing alternatives, at similar hyperparameter settings.

10:

11: \end{abstract}

12: