\begin{abstract}
  \noindent
  In this paper we develop a statistical theory and an implementation of deep
  learning (DL) models. We show that an elegant variable splitting scheme for
  the alternating direction method of multipliers (ADMM) optimises a deep
  learning objective. We allow for non-smooth, non-convex regularisation
  penalties to induce sparsity in parameter weights. We provide a link between
  traditional shallow-layer statistical models, such as principal component and
  sliced inverse regression, and deep-layer models. We also define the degrees
  of freedom of a deep learning predictor and a predictive MSE criterion to
  perform model selection when comparing architecture designs. We focus on deep
  multi-class logistic learning, although our methods apply more generally. Our
  results suggest an interesting and previously under-exploited relationship
  between deep learning and proximal splitting techniques. To illustrate our
  methodology, we provide a multi-class logit classification analysis of
  Fisher's Iris data, in which we demonstrate the convergence of our algorithm.
  Finally, we conclude with directions for future research.

  \vspace{0.1in}
  \noindent Keywords: Deep Learning; Sparsity; Dropout; Convolutional Neural Nets;
  Regularisation; Bayesian MAP; Image Segmentation; Classification; Multi-class Logistic Regression.
\end{abstract}