\begin{abstract}
% Learning-based applications have demonstrated practical use cases in ubiquitous environments and amplified interest in exploiting the data stored on users’ mobile devices. Distributed optimization algorithms aim to leverage such distributed and diverse data to learn a global phenomenon by training on the participating devices and repeatedly aggregating their local models’ parameters into a global model. Federated Averaging is a promising solution that extends local training before the parameters are aggregated, which improves communication efficiency. However, when the participants’ data are strongly skewed (i.e., the local distributions differ), model accuracy can drop significantly. To address this challenge, we leverage the edge computing paradigm to design a hierarchical learning system that performs Federated Gradient Descent on the user-edge layer and Federated Averaging on the edge-cloud layer. In this hierarchical architecture, users may be assigned to different edges, which leads to different edge-level data distributions. We formalize and optimize this user-edge assignment problem to minimize the distance between the class distributions of the edge nodes, which enhances Federated Averaging performance. Our experiments on multiple real datasets show that the proposed optimized assignment is tractable and leads to faster model convergence to a higher accuracy.
% \end{abstract}
