
\begin{abstract}

Federated learning (FL) enables on-device training over distributed networks consisting of a massive number of modern smart devices, such as smartphones and Internet of Things (IoT) devices.
However, the leading optimization algorithm in such settings, \emph{federated averaging} (FedAvg), suffers from heavy communication costs and an inevitable performance drop, especially when the local data is distributed in a non-IID way.
To alleviate this problem, we propose two potential solutions that introduce additional mechanisms into on-device training.


The first, FedMMD, trains a two-stream model with a maximum mean discrepancy (MMD) constraint on each device, instead of the single model used in vanilla FedAvg.
Experiments show that FedMMD outperforms baselines, especially in non-IID FL settings, reducing the number of required communication rounds by more than 20\%.


The second is FL with feature fusion (FedFusion).
By aggregating the features from both the local and global models, we achieve higher accuracy at a lower communication cost.
Furthermore, the feature fusion modules offer better initialization for newly incoming clients and thus speed up convergence.
Experiments in popular FL scenarios show that FedFusion outperforms baselines in both accuracy and generalization ability while reducing the number of required communication rounds by more than 60\%.


\end{abstract}
