\begin{abstract}

Although machine learning models typically experience a drop in performance on out-of-distribution data, accuracies on in- versus out-of-distribution data are widely observed to follow a single linear trend when evaluated across a testbed of models.

Models that are more accurate on the out-of-distribution data relative to this baseline exhibit ``effective robustness'' and are exceedingly rare.

Identifying such models, and understanding their properties, is key to improving out-of-distribution performance.

We conduct a thorough empirical investigation of effective robustness during fine-tuning and, surprisingly, find that models pre-trained on larger datasets exhibit effective robustness during training that vanishes at convergence.

We study how properties of the data influence effective robustness and show that it increases with the size, diversity, and example difficulty of the dataset.

We also find that models displaying effective robustness correctly classify 10\% of the examples that no other current testbed model gets correct.

Finally, we discuss several strategies for scaling effective robustness to the high-accuracy regime to improve the out-of-distribution accuracy of state-of-the-art models.

\end{abstract}
