\begin{abstract}
Multilayer Perceptrons struggle to learn certain simple arithmetic tasks.
Specialist neural modules for arithmetic can outperform classical architectures, with gains in extrapolation, interpretability and convergence speed, but are highly sensitive to the training range.
In this paper, we show that Neural Multiplication Units (NMUs) are unable to reliably learn tasks as simple as multiplying two inputs across different training ranges.
We link the causes of failure to inductive and input biases which encourage convergence to undesirable optima.
We propose a solution, the stochastic NMU (sNMU), which applies reversible stochasticity to encourage avoidance of such optima whilst still converging to the true solution.
Empirically, we show that stochasticity provides improved robustness, with the potential to improve the learned representations of upstream networks for numerical and image tasks.
\end{abstract}