\begin{abstract}
This paper proposes a fully decentralized \gls{FL} scheme for \gls{IoE} devices that are connected via multi-hop networks.
Because the parameters of \gls{ML} models hardly converge under \gls{FL} algorithms,
this paper focuses on the convergence of \gls{ML} models in \textit{function spaces}.
Considering that the representative loss functions of \gls{ML} tasks, e.g., \gls{MSE} and \gls{KL} divergence, are convex \textit{functionals},
algorithms that directly update functions in function spaces could converge to the optimal solution.
The key concept of this paper is
to tailor a consensus-based optimization algorithm to work in the function space
and achieve the global optimum in a distributed manner.
This paper first analyzes the convergence of the proposed function-space algorithm, which is referred to as a meta-algorithm,
and shows that spectral graph theory can be applied to the function space in a manner similar to its application to numerical vectors.
Then, \gls{CMFD} is developed for a \gls{NN} to implement the meta-algorithm.
\Gls{CMFD} leverages knowledge distillation to realize function aggregation among adjacent devices without parameter averaging.
An advantage of \gls{CMFD} is that it works even when the distributed learners use different \gls{NN} models.
Although \gls{CMFD} does not perfectly reflect the behavior of the meta-algorithm,
the discussion of the meta-algorithm's convergence property promotes an intuitive understanding of \gls{CMFD},
and simulation evaluations show that \gls{NN} models converge using \gls{CMFD} for several tasks.
The simulation results also show that \gls{CMFD} achieves higher accuracy than parameter aggregation in weakly connected networks
and is more stable than parameter aggregation methods.
\end{abstract}
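% Illustrative sketch only: every symbol below is an assumption made for exposition, not the paper's notation.
As an illustrative sketch only, with notation that is assumed here rather than taken from the paper, one consensus-based update in a function space could take the following form.
Let $f_i^{(t)}$ denote the function held by device $i$ at iteration $t$, $\mathcal{N}_i$ its set of adjacent devices, $L_i$ its local convex loss functional, and $\epsilon, \eta > 0$ hypothetical consensus and learning step sizes:
\[
  f_i^{(t+1)} = f_i^{(t)}
  + \epsilon \sum_{j \in \mathcal{N}_i} \bigl( f_j^{(t)} - f_i^{(t)} \bigr)
  - \eta \, \nabla_f L_i \bigl[ f_i^{(t)} \bigr].
\]
The first correction term pulls the functions of adjacent devices toward agreement, and the second descends the local loss functional.
For instance, the \gls{MSE} functional $L[f] = \mathrm{E}\bigl[ (f(x) - y)^2 \bigr]$ satisfies
$L[\lambda f + (1 - \lambda) g] \le \lambda L[f] + (1 - \lambda) L[g]$ for all $\lambda \in [0, 1]$,
which is the kind of convexity in $f$ that the argument above relies on, even though the same loss is generally non-convex in the \gls{NN} parameters.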
