
\begin{abstract}
Synchronous \ac{fl} is a popular paradigm for collaborative edge learning. It typically involves a set of heterogeneous devices locally training \ac{nn} models in parallel, with periodic centralized aggregations. As some of the devices may have limited computational resources and varying availability, \ac{fl} latency is highly sensitive to stragglers. Conventional approaches discard incomplete intra-model updates computed by stragglers, alter the amount of local workload and the architecture, or resort to asynchronous settings, all of which affect the trained model performance
% of the trained model when operating
under tight training latency constraints.
In this work, we propose {\em \ac{salf}}, which leverages the optimization procedure of \acp{nn} via backpropagation to update the global model in a {\em layer-wise} fashion. \ac{salf} allows stragglers to synchronously convey partial gradients, such that each layer of the global model is updated independently with a different contributing set of users. We provide a theoretical analysis that establishes convergence guarantees for the global model under mild assumptions on the distribution of the participating devices, revealing that \ac{salf}
%Our analysis reveals that \ac{salf}-aided \ac{fl} of \acp{nn} is proven to
converges at the same asymptotic rate as
\ac{fl} with no timing limitations. This insight is matched by empirical observations, which demonstrate the performance gains of \ac{salf} compared to alternative mechanisms for mitigating the device heterogeneity gap in \ac{fl}.
%, and its ability to facilitate synchronous \ac{fl} with low latency constraints without notably affecting the utility of the learned model.
\end{abstract}