\begin{abstract}
In machine learning or scientific computing, model performance is measured with an objective function.
But why choose one objective over another?
Information theory gives one answer:
\textit{To maximize the information in the model, select the objective function that represents the error in the fewest bits}.
\remove{Extraneous bits are noise, which can cause slower convergence and suboptimal solutions.}
To evaluate different objectives, transform them into likelihood functions.
As likelihoods, their relative magnitude represents how strongly we should prefer one objective over another,
and the log of that ratio represents the difference in their bit lengths, as well as the difference in their uncertainty.
\add{In other words, prefer whichever objective minimizes the uncertainty.
Under the information-theoretic paradigm, the ultimate objective is to maximize information (and minimize uncertainty),
as opposed to any specific utility.
We argue that this paradigm is well suited to models
that have many uses and no definite utility,
such as the large Earth system models used to understand the effects of climate change.
}
\end{abstract}
