\begin{abstract}
In machine learning or scientific computing, model performance is measured with an objective function.
But why choose one objective over another?
Information theory gives one answer:
\textit{To maximize the information in the model, select the objective function that represents the error in the fewest bits}.
\remove{Extraneous bits are noise, which can cause slower convergence and suboptimal solutions.}
To compare objectives, first transform them into likelihood functions.
As likelihoods, their ratio represents how strongly we should prefer one objective over another,
and the logarithm of that ratio represents the difference in their bit lengths, as well as the difference in their uncertainty.
\add{In other words, prefer whichever objective minimizes the uncertainty.
Under the information-theoretic paradigm, the ultimate objective is to maximize information (and minimize uncertainty),
as opposed to any specific utility.
We argue that this paradigm is well-suited to models
that have many uses and no definite utility,
like the large Earth system models used to understand the effects of climate change.
}
\end{abstract}