\begin{abstract}
This work is motivated by the engineering task of achieving near
state-of-the-art face recognition on a minimal computing budget
on an embedded system. Our main technical contribution
is a novel training method, called Multibatch, for
similarity learning, i.e., for the task of generating an invariant
``face signature'' by training on pairs of ``same'' and
``not-same'' face images. The Multibatch method first generates
signatures for a mini-batch of $k$ face images and then constructs
an unbiased estimate of the full gradient by relying on all $k^2-k$
pairs from the mini-batch. We prove that, under some mild
conditions, the variance of the Multibatch estimator is bounded by
$O(1/k^2)$. In contrast, the standard gradient estimator that relies
on $k/2$ random pairs has a variance of order $1/k$. The smaller
variance of the Multibatch estimator significantly speeds up the
convergence of stochastic gradient descent. Using the
Multibatch method, we train a deep convolutional neural network that
achieves an accuracy of $98.2\%$ on the LFW benchmark, while its
prediction runtime is only $30$\,ms on a single ARM Cortex A9
core. Furthermore, the entire training process took only 12 hours on
a single Titan X GPU.
\end{abstract}
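
For concreteness, the two gradient estimators compared in the abstract can be sketched as follows. The notation is assumed here for illustration only: $w$ denotes the network parameters, $\ell(w; x, x')$ the loss on a pair of images, and $x_1, \dots, x_k$ the images of one mini-batch.
\[
\hat{g}_{\mathrm{pairs}} = \frac{2}{k} \sum_{i=1}^{k/2} \nabla_w \ell(w; x_{2i-1}, x_{2i}),
\qquad
\hat{g}_{\mathrm{MB}} = \frac{1}{k^2-k} \sum_{i \neq j} \nabla_w \ell(w; x_i, x_j).
\]
Under this reading, both estimators require signatures for the same $k$ images, but $\hat{g}_{\mathrm{pairs}}$ averages over $k/2$ disjoint pairs, whereas $\hat{g}_{\mathrm{MB}}$ reuses every signature across all $k^2-k$ ordered pairs of the mini-batch.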