\begin{abstract}
Although machine learning models typically experience a drop in performance on out-of-distribution data, accuracies on in- versus out-of-distribution data are widely observed to follow a single linear trend when evaluated across a testbed of models.
Models that are more accurate on out-of-distribution data than this linear trend predicts exhibit ``effective robustness'' and are exceedingly rare.
Identifying such models, and understanding their properties, is key to improving out-of-distribution performance.
We conduct a thorough empirical investigation of effective robustness during fine-tuning and surprisingly find that models pre-trained on larger datasets exhibit effective robustness during training that vanishes at convergence.
We study how properties of the data influence effective robustness, and we show that it increases with the size, diversity, and example difficulty of the dataset.
We also find that models that display effective robustness are able to correctly classify 10\% of the examples that no other current testbed model gets correct.
Finally, we discuss several strategies for scaling effective robustness to the high-accuracy regime to improve the out-of-distribution accuracy of state-of-the-art models.
\end{abstract}