1: \begin{abstract}
2: %What’s the domain?
3: The Neural Arithmetic Logic Unit (NALU) is a neural network layer that can learn exact arithmetic operations between the elements of a hidden state.
4: The goal of NALU is to learn perfect extrapolation, which requires learning the exact underlying logic of an unknown arithmetic problem.
5: %What’s the issue?
6: Evaluating the performance of the NALU is non-trivial as one arithmetic problem might have many solutions.
7: As a consequence, single-instance MSE has been used to evaluate and compare performance between models.
8: However, it is hard to interpret what magnitude of MSE represents a correct solution, and hard to assess a model's sensitivity to initialization.
9: %What’s your contribution?
10: We propose using a success-criterion to measure if and when a model converges.
11: Using a success-criterion, we can summarize the success-rate over many initialization seeds and calculate confidence intervals.
12: We contribute a generalized version of the previous arithmetic benchmark to measure a model's sensitivity under different conditions.
13: %Why is it novel?
14: This is, to our knowledge, the first extensive evaluation with respect to convergence of the NALU and its sub-units.
15: %What’s interesting about it?
16: %An interesting finding is the high variability in convergence when modifying dataset parameters.
17: %How does it perform?
18: Using a success-criterion to summarize 4800 experiments, we find that consistently learning arithmetic extrapolation is challenging, in particular for multiplication.
19: \ifdefined\nonanonymous\footnote{Code for the experiments is publicly available at \url{https://github.com/AndreasMadsen/stable-nalu}.}\fi
20: \end{abstract}
21: