\begin{abstract}
Federated learning (FL) preserves privacy in model training by exposing only users' model gradients. Yet FL users remain susceptible to the gradient inversion (GI) attack, which can reconstruct ground-truth training data such as images from model gradients. Existing GI attacks face two challenges when reconstructing high-resolution images: inferior accuracy and slow convergence, especially in complicated settings, \emph{e.g.}, when the training batch size on each FL user is much greater than 1. To address these challenges, we present a $\textbf{R}$obust, $\textbf{A}$ccurate and $\textbf{F}$ast-convergent $\textbf{GI}$ attack algorithm, called $\textbf{RAF-GI}$, with two components: 1) $\textbf{A}$dditional $\textbf{C}$onvolution $\textbf{B}$lock ($\textbf{ACB}$), which improves label restoration by up to 20\% over existing works; 2) $\textbf{T}$otal variance, three-channel m$\textbf{E}$an and c$\textbf{A}$nny edge detection regularization term ($\textbf{TEA}$), a white-box attack strategy that reconstructs images based on the labels inferred by $\textbf{ACB}$. Moreover, $\textbf{RAF-GI}$ is robust: it still accurately reconstructs ground-truth data when the users' training batch size is no more than 48. Our experimental results show that $\textbf{RAF-GI}$ reduces time costs by 94\% while achieving superb inversion quality on the ImageNet dataset. Notably, with a batch size of 1, $\textbf{RAF-GI}$ achieves a Peak Signal-to-Noise Ratio (PSNR) 7.89 higher than that of the state-of-the-art baselines.
% {\bf YP: why we need to highlight the case with batch size=1? It is better to use the case with batch size=48 here?}
\end{abstract}
