
\begin{abstract}

Federated learning relies heavily on distributed gradient descent techniques. When gradient information is unavailable, gradients must be estimated from zeroth-order information, typically by computing finite differences along isotropic random directions.

This approach suffers from high estimation error, because isotropic sampling may overlook the geometric features of the objective landscape.

In this work, we propose a non-isotropic sampling method to improve gradient estimation.

Our method estimates gradients in a subspace spanned by historical solution trajectories, with the aim of encouraging exploration of promising regions and thereby improving convergence.

The proposed method samples directions using a covariance matrix that is a convex combination of two parts. The first part is a thin projection matrix containing the basis of the subspace, which is designed to improve the exploitation ability. The second part is the historical trajectories.

We implement this method in zeroth-order federated settings and show that its convergence rate is in line with that of existing methods, while introducing no significant overhead in communication or local computation.

The effectiveness of our proposal is verified in several numerical experiments against commonly used zeroth-order federated optimization algorithms.

\end{abstract}
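
For concreteness, the sampling schemes contrasted in the abstract can be sketched as follows. This is a standard textbook formulation given under stated assumptions, not necessarily the exact estimator used in this work; the symbols $\mu$ (smoothing radius), $u$ (random direction), $\alpha$ (mixing weight), and $\Sigma_1, \Sigma_2$ (the two covariance parts) are illustrative notation introduced here. A common forward-difference estimator along an isotropic Gaussian direction is
\begin{equation*}
  \hat{\nabla} f(x) = \frac{f(x + \mu u) - f(x)}{\mu}\, u,
  \qquad u \sim \mathcal{N}(0, I_d).
\end{equation*}
A non-isotropic variant keeps the same finite difference but draws the direction from a shaped Gaussian whose covariance is a convex combination of two parts,
\begin{equation*}
  u \sim \mathcal{N}(0, \Sigma),
  \qquad \Sigma = \alpha\, \Sigma_1 + (1 - \alpha)\, \Sigma_2,
  \qquad \alpha \in [0, 1].
\end{equation*}
For smooth $f$ and small $\mu$, a first-order Taylor expansion gives $\mathbb{E}\bigl[\hat{\nabla} f(x)\bigr] \approx \mathbb{E}\bigl[u u^{\top}\bigr] \nabla f(x) = \Sigma\, \nabla f(x)$, so the choice of $\Sigma$ determines which directions of the gradient the estimator emphasizes.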
