\begin{abstract}
For any given neural network architecture, a suitable permutation of the 
weights and biases within each layer yields the same functional network. 
This implies that optimization algorithms used to `train' or `learn' the 
network are faced with a very large number (in the millions even for small
networks) of equivalent optimal solutions in the parameter space. To 
the best of our knowledge, this observation is absent from the literature.
To narrow down the parameter search space, a novel technique 
is introduced that fixes the bias vector configurations to be 
monotonically increasing. This is achieved by augmenting a typical 
learning problem with inequality constraints on the bias vectors in each 
layer. A Moreau-Yosida regularization-based algorithm is proposed to 
handle these inequality constraints, and theoretical convergence of 
this algorithm is established. Applications to standard trigonometric 
functions and to more challenging stiff ordinary differential equations 
arising in chemically reacting flows illustrate the benefits of the 
proposed approach. A further application to the MNIST dataset within 
TensorFlow illustrates that the approach can be incorporated 
into any of the existing machine learning libraries.
\end{abstract}
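
To make the counting argument concrete, consider the following illustrative sketch. The notation ($W_1$, $b_1$, $W_2$, $b_2$, $\sigma$, $P$, $n$, $\gamma$) is introduced here only for illustration and is an assumption; the exact formulation used in the remainder of the paper may differ. For a network with a single hidden layer of $n$ neurons and a componentwise activation $\sigma$, write
\[
  f(x) = W_2\,\sigma(W_1 x + b_1) + b_2 .
\]
For any $n \times n$ permutation matrix $P$, a componentwise activation satisfies $\sigma(Pz) = P\,\sigma(z)$, and $P^{T} P = I$, so that
\[
  (W_2 P^{T})\,\sigma\bigl(P W_1 x + P b_1\bigr) + b_2
  = W_2 P^{T} P\,\sigma(W_1 x + b_1) + b_2
  = f(x) .
\]
The parameter sets $(W_1, b_1, W_2, b_2)$ and $(P W_1, P b_1, W_2 P^{T}, b_2)$ therefore represent the same function, and there are up to $n!$ such equivalent sets; already $n = 10$ gives $10! = 3{,}628{,}800$. Requiring $(b_1)_1 \le (b_1)_2 \le \dots \le (b_1)_n$ singles out one representative whenever the entries of $b_1$ are distinct. Under the same assumptions, a Moreau-Yosida-type penalty for these ordering constraints could take the form, with a parameter $\gamma > 0$,
\[
  \frac{\gamma}{2} \sum_{j=1}^{n-1} \max\bigl(0,\; (b_1)_j - (b_1)_{j+1}\bigr)^{2} ,
\]
which vanishes exactly when the bias entries are in nondecreasing order and grows quadratically with the size of any violation.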