\begin{abstract}
  This work is motivated by the engineering task of achieving near
  state-of-the-art face recognition with a minimal computing budget
  on an embedded system.  Our main technical contribution
  centers on a novel training method, called Multibatch, for
  similarity learning, i.e., for the task of generating an invariant
  ``face signature'' from training pairs of ``same'' and
  ``not-same'' face images. The Multibatch method first generates
  signatures for a mini-batch of $k$ face images and then constructs
  an unbiased estimate of the full gradient by relying on all $k^2-k$
  pairs from the mini-batch. We prove that the variance of the
  Multibatch estimator is bounded by $O(1/k^2)$, under some mild
  conditions. In contrast, the standard gradient estimator that relies
  on $k/2$ random pairs has a variance of order $1/k$. The smaller
  variance of the Multibatch estimator significantly speeds up the
  convergence of stochastic gradient descent.  Using the
  Multibatch method, we train a deep convolutional neural network that
  achieves an accuracy of $98.2\%$ on the LFW benchmark, while its
  prediction runtime is only $30$\,ms on a single ARM Cortex A9
  core. Furthermore, the entire training process took only 12 hours on
  a single Titan X GPU.
\end{abstract}

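The comparison made in the abstract can be written out as a minimal
sketch; it is an illustration under assumed notation, not the formal
definition used later in the paper. The symbols $f_w$ (the signature
network with weights $w$), $\ell$ (a pairwise loss), and $y_{ij}$ (the
same/not-same label of the pair $(x_i,x_j)$) are introduced here only
for illustration. Given a mini-batch $x_1,\dots,x_k$, an estimator of
the Multibatch kind averages the per-pair gradients over all ordered
pairs of distinct images:
\[
  \hat{g}_{\mathrm{MB}}(w) \;=\; \frac{1}{k^2-k} \sum_{i \neq j}
  \nabla_w\, \ell\bigl(f_w(x_i),\, f_w(x_j),\, y_{ij}\bigr),
\]
whereas the standard estimator, assuming the $k/2$ random pairs are
disjoint, averages over only those pairs:
\[
  \hat{g}_{\mathrm{pairs}}(w) \;=\; \frac{2}{k} \sum_{i=1}^{k/2}
  \nabla_w\, \ell\bigl(f_w(x_{2i-1}),\, f_w(x_{2i}),\, y_{2i-1,2i}\bigr).
\]
Under that assumption both estimators involve $k$ images, and hence
$k$ evaluations of $f_w$. For example, with $k=256$ the standard
estimator uses $128$ pairs, while the all-pairs estimator uses
$256^2-256 = 65280$ ordered pairs.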