\begin{abstract}
We study Byzantine fault-tolerance in a federated optimization setting, in which a group of agents communicates with a centralized coordinator. We allow up to $f$ Byzantine-faulty agents, which may not follow the prescribed algorithm correctly and may share arbitrary incorrect information with the coordinator. Each non-faulty agent has a local cost function, and the goal of the non-faulty agents is to compute a minimizer of their aggregate cost function. To solve this problem, we propose a local gradient-descent (GD) algorithm that incorporates a novel {\em comparative elimination} (CE) filter (a.k.a.\ aggregation scheme) to provably mitigate the detrimental impact of Byzantine faults. In the deterministic setting, where the agents can compute their local gradients accurately, our algorithm guarantees {\em exact} fault-tolerance against a bounded fraction of Byzantine agents, provided the non-faulty agents satisfy the known necessary condition of {\em $2f$-redundancy}. In the stochastic setting, where the agents can only compute stochastic estimates of their gradients, our algorithm guarantees {\em approximate} fault-tolerance, with an approximation error proportional to the variance of the stochastic gradients and the fraction of Byzantine agents.
% Finally, we illustrate the efficacy of our scheme through numerical experiments on a robust mean estimation problem, which is a special case of the Byzantine fault-tolerance problem.
% propose a distributed local stochastic gradient descent algorithm with provable {\em exact fault-tolerance} against a bounded fraction of faulty agents, provided the non-faulty agents have the necessary property named {\em $2f$-redundancy}. In addition, we provide an explicit formula to characterize the finite-time convergence rate of the proposed methods.
\end{abstract}