\begin{abstract}
For any given neural network architecture, a permutation of the weights
and biases results in a functionally identical network. This implies
that the optimization algorithms used to `train' or `learn' the network
face a very large number (in the millions, even for small
networks) of equivalent optimal solutions in the parameter space. To
the best of our knowledge, this observation is absent from the literature.
To narrow down the parameter search space, a novel technique
is introduced that fixes the bias vector configurations to be
monotonically increasing. This is achieved by augmenting a typical
learning problem with inequality constraints on the bias vectors in each
layer. A Moreau-Yosida regularization based algorithm is proposed to
handle these inequality constraints, and the convergence of
this algorithm is established theoretically. Applications
to standard trigonometric functions and to more challenging stiff ordinary
differential equations arising in chemically reacting flows clearly
illustrate the benefits of the proposed approach.
A further application of the approach to the MNIST dataset within
TensorFlow illustrates that it can be incorporated
into existing machine learning libraries.
\end{abstract}

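% Illustrative sketch only: the notation (W_1, b_1, W_2, b_2, \sigma, P, n)
% is assumed here and is not necessarily the notation used in the paper.
As a minimal illustration of the permutation symmetry described in the
abstract, consider a network with a single hidden layer of $n$ neurons.
The symbols $W_1$, $b_1$, $W_2$, $b_2$, $\sigma$ and $P$ are assumed for
this sketch only. Let
\[
  f(x) = W_2\,\sigma(W_1 x + b_1) + b_2 ,
\]
where the activation $\sigma$ acts componentwise. For any $n \times n$
permutation matrix $P$ we have $\sigma(Pz) = P\,\sigma(z)$ and
$P^{\top} P = I$, so that
\[
  (W_2 P^{\top})\,\sigma\bigl((P W_1)\,x + P b_1\bigr) + b_2
  = W_2 P^{\top} P\,\sigma(W_1 x + b_1) + b_2
  = f(x).
\]
Hence the parameter sets $(P W_1,\, P b_1,\, W_2 P^{\top},\, b_2)$ all
represent the same function, which gives up to $n!$ equivalent parameter
sets for this one layer; already for $n = 10$ this is
$10! = 3{,}628{,}800$. An ordering constraint of the form
\[
  (b_1)_1 \le (b_1)_2 \le \dots \le (b_1)_n
\]
selects one representative from each such family, and this representative
is unique when the entries of $b_1$ are distinct.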