1: \begin{abstract}
2: This paper presents a compact, matrix-based representation of neural networks in a self-contained tutorial fashion.
3: %
4: % Specifically, we develop neural networks as a composition of several vector-valued functions.
5: %
6: Although neural networks are well understood pictorially in terms of interconnected neurons, they are, mathematically, nonlinear functions constructed by composing several vector-valued functions.
7: %
8: Using basic results from linear algebra, we represent a neural network as an alternating sequence of linear maps and scalar nonlinear functions, also known as activation functions.
9: % , which are parameterized by matrix multiplications, and nonlinear maps.
10: %
11: Training a neural network requires minimizing a cost function, which, in gradient-based methods, requires computing the gradient of the cost with respect to the network parameters.
12: %
13: Using basic results from multivariable calculus, we show that the cost gradient is also a function composed of a sequence of linear maps and nonlinear functions.
14: % also known as backpropagation.
15: %
16: In addition to training with the analytical gradient, we consider two gradient-free training methods and compare the three methods in terms of convergence rate and prediction accuracy.
17: % Three gradient-free training schemes are implemented and the results of training are compared with the gradient-based training.
18: % Finally, a novel data-driven, gradient-free training algorithm is presented and compared.
19:
20: \end{abstract}
21: