
\begin{abstract}
Current deep neural networks (DNNs) are vulnerable to adversarial attacks, in which adversarial perturbations to the inputs can change or manipulate the classification result.
To defend against such attacks, an effective and popular approach known as \textit{adversarial training (AT)} has been shown to mitigate the negative impact of adversarial attacks by means of a min-max robust training method.
While AT is effective, it remains unclear whether it can be successfully adapted to the distributed learning context.
The power of distributed optimization over multiple machines enables us to scale up robust training over large models and datasets.
Spurred by that, we propose \textit{distributed adversarial training (DAT)}, a \textit{large-batch} adversarial training framework implemented over multiple machines.
We show that DAT is general: it supports training over labeled and unlabeled data, multiple types of attack generation methods, and gradient compression operations favored for distributed optimization.
Theoretically, under standard conditions in optimization theory, we provide the convergence rate of DAT to first-order stationary points in general non-convex settings.
Empirically, we demonstrate that DAT either matches or outperforms state-of-the-art robust accuracies and achieves a graceful training speedup (e.g., on ResNet-50 on ImageNet).
Code is available at \url{https://github.com/dat-2022/dat}.
\end{abstract}
