
\begin{abstract}
		We provide a constructive estimate of the convergence rate for
		training a well-known class of neural networks: multi-class logistic
		regression. Although this model has been used successfully for
		several decades, our rigorous results appear to be new, which
		reflects the gap between the practice and the theory of machine
		learning. Neural networks are typically trained with variants of
		gradient descent. Assuming that a minimum of the loss function
		exists and that gradient descent is the training method, we give an
		expression relating the learning rate to the rate of convergence to
		the minimum. The derivation relies on an estimate of the condition
		number of the Hessian of the loss function. We also discuss the
		existence of a minimum, which is not automatic. One way to ensure
		convergence is to assign positive probability to every class in the
		training dataset.
	\end{abstract}
