\begin{abstract}
Comparative evaluation lies at the heart of science, and determining the
accuracy of a computational method is crucial for evaluating its potential as
well as for guiding future efforts. However, the metrics that are typically
used have inherent shortcomings when faced with the under-resolved solutions
of real-world simulation problems. We show how to leverage crowd-sourced user
studies in order to address the fundamental problems of widely used classical
evaluation metrics. We demonstrate that such user studies, which inherently
rely on the human visual system, yield a very robust metric and consistent
answers for complex phenomena without requiring any proficiency in the
physics at hand. This holds even for cases away from convergence, where
traditional metrics often yield inconclusive results. More
specifically, we evaluate results of different \ac{eno} schemes in different
fluid flow settings. Our methodology represents a novel and practical approach
for scientific evaluations that can provide answers to previously unsolved
problems.
\end{abstract}
