\begin{abstract}
Despite the great successes achieved by deep neural networks (DNNs), recent studies show that they are vulnerable to adversarial examples, which are inputs crafted with small adversarial perturbations to mislead DNNs.
Several defenses have been proposed against such attacks, but many of them have subsequently been broken by adaptive attacks.
In this work, we aim to enhance ML robustness from a different perspective by leveraging \textit{domain knowledge}.
We propose a \framework (\sys) to integrate domain knowledge (i.e., logical relationships among different predictions) into a probabilistic graphical model via first-order logic rules.
In particular, we develop \sys by integrating a diverse set of weak auxiliary models based on their logical relationships to the main DNN model that performs the target task.
Theoretically, we provide convergence results and prove that, under mild conditions, the prediction of \sys is more robust than that of the main DNN model.
Empirically, we take road sign recognition as an example and leverage the relationships between road signs and their shapes and contents as domain knowledge.
We show that, compared with adversarial training and other baselines, \sys achieves higher robustness against physical attacks, $\mathcal{L}_p$-bounded attacks, unforeseen attacks, and natural corruptions under both white-box and black-box settings, while maintaining high clean accuracy.
\end{abstract}
