\begin{abstract}
  We analyze the topological properties of the set of
  functions that can be implemented by neural networks of a fixed size.
  Surprisingly, this set has many undesirable properties.
  It is highly non-convex, except possibly for a few exotic activation functions.
  Moreover, the set is not closed with respect to $L^p$-norms,
  $0 < p < \infty$, for all practically used activation functions, and it is also not
  closed with respect to the $L^\infty$-norm for all practically used activation
  functions except for the ReLU and the parametric ReLU.
  Finally, the function that maps a family of weights to the function computed by the
  associated network is not inverse stable for any practically used activation function.
  In other words, if $f_1, f_2$ are two functions realized by neural networks
  and if $f_1, f_2$ are close in the sense that $\|f_1 - f_2\|_{L^\infty} \leq \eps$ for some $\eps > 0$,
  then, regardless of the size of $\eps$, it is usually not possible to find weights $w_1, w_2$
  that are close together such that each $f_i$ is realized by a neural network with weights $w_i$.
  Overall, our findings identify potential causes for issues in the
  training procedure of deep learning, such as the lack of guaranteed convergence,
  explosion of parameters, and slow convergence.
\end{abstract}
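
% Illustrative sketch, not part of the original abstract. The ReLU
% $\varrho$, the domain $[-1,1]$, and the functions $f_n$, $f$ are
% assumptions chosen here purely for illustration.
The non-closedness claim can be made concrete with a minimal sketch,
assuming the ReLU $\varrho(x) = \max\{0, x\}$ and the domain $[-1,1]$.
For $n = 1, 2, \dots$ consider
\[
  f_n(x) = n\,\varrho(x + 1/n) - n\,\varrho(x)
  =
  \begin{cases}
    0,      & x \leq -1/n, \\
    nx + 1, & -1/n < x < 0, \\
    1,      & x \geq 0,
  \end{cases}
\]
which is realized by a network with a single hidden layer of two neurons.
Writing $f$ for the indicator function of $[0,1]$, one finds for $0 < p < \infty$
\[
  \|f_n - f\|_{L^p([-1,1])}^p
  = \int_{-1/n}^{0} (nx + 1)^p \, dx
  = \frac{1}{n\,(p+1)}
  \longrightarrow 0
  \quad \text{as } n \to \infty .
\]
The limit $f$ does not agree almost everywhere with any continuous function,
so it is not realized by any ReLU network, and the weights of $f_n$ grow
like $n$. In contrast, $\|f_n - f\|_{L^\infty([-1,1])} = 1$ for every $n$,
so this particular sequence does not converge in the $L^\infty$-norm.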
