\begin{abstract}

Convolution, a central operation in Convolutional Neural Networks (CNNs), applies a kernel to overlapping regions shifted across the image.

However, because of the strong correlations in real-world image data, convolutional kernels are in effect re-learning redundant data.

In this work, we show that this redundancy makes neural network training challenging, and we propose \emph{network deconvolution}, a procedure that optimally removes pixel-wise and channel-wise correlations before the data is fed into each layer. Network deconvolution can be computed efficiently, at a fraction of the computational cost of a convolution layer. We also show that the deconvolution filters in the first layer of the network resemble the center-surround structure found in biological neurons in the visual regions of the brain.

Filtering with such kernels results in a sparse representation, a desired property that has been missing in the training of neural networks. Learning from the sparse representation promotes faster convergence and superior results \textit{without} the use of batch normalization. We apply our network deconvolution operation to $10$ modern neural network models by replacing batch normalization within each. Extensive experiments show that the network deconvolution operation delivers performance improvements in all cases on the CIFAR-10, CIFAR-100, MNIST, Fashion-MNIST, Cityscapes, and ImageNet datasets.

\end{abstract}
